HUMAN GLYCINE TRANSPORTER

This application is related to the following co-pending applications: "Glycine
Transporter-Transfected Cells and Uses Thereof," Attorney Docket No. 317743-105,
Serial No. 08/655,836, filed May 31, 1996; "Pharmaceutical For Treatment Of
Neurological And Neuropsychiatry Disorders," Attorney Docket No. 317743-103, Serial
No. 08/656,063, filed May 31, 1996; "Pharmaceutical For Treatment of Neuropsychiatry
Disorders," Attorney Docket No. 317743-106, Serial No. 08/655,912, filed May 31, 1996;
and "Pharmaceutical For Treating Of Neurological and Neuropsychiatry Disorders,"
Attorney Docket No. 317743-107, Serial No. 08/655,847, filed May 31, 1996.

The present invention relates to nucleic acid encoding the "GlyT-2" member
of the family of human glycine transporters, to the isolated protein encoded by the
nucleic acid, and to the field of drug discovery.

Synaptic transmission is a complex form of intercellular communication that
involves a considerable array of specialized structures in both the pre- and post-synaptic
neuron. High-affinity neurotransmitter transporters are one such component, located on
the pre-synaptic terminal and surrounding glial cells (Kanner and Schuldiner, CRC
Critical Reviews in Biochemistry 22: 1032, 1987). Transporters sequester
neurotransmitter from the synapse, thereby regulating the concentration of
neurotransmitter in the synapse, as well as its duration in the synapse, which together
influence the magnitude of synaptic transmission. By preventing the spread of transmitter
to neighboring synapses, transporters maintain the fidelity of synaptic transmission.
Further, by sequestering released transmitter, transporters
allow for transmitter reutilization.

Neurotransmitter transport is dependent on extracellular sodium and the
voltage difference across the membrane; under conditions of intense neuronal firing,
for example during a seizure, transporters can function in reverse, releasing
neurotransmitter in a calcium-independent non-exocytotic manner (Attwell et al., Neuron
11: 401-407, 1993). Pharmacologic modulation of neurotransmitter transporters thus
provides a means for modifying synaptic activity, which provides useful therapy for the
treatment of neurological and psychiatric disturbances.

The amino acid glycine is a major neurotransmitter in the mammalian
nervous system, functioning at both inhibitory and excitatory synapses. By the phrase
"nervous system," both the central and peripheral portions of the nervous system are
intended. The distinct inhibitory and excitatory functions of glycine are mediated by two
different types of receptor, each of which is associated with a different class of glycine
transporter. The inhibitory actions of glycine are mediated by glycine receptors that are
sensitive to the convulsant alkaloid, strychnine, and are thus referred to as
"strychnine-sensitive". Such receptors contain an intrinsic chloride channel that is opened upon
binding of glycine to the receptor; by increasing chloride conductance, the threshold for
firing of an action potential is increased. Strychnine-sensitive glycine receptors are found
predominantly in the spinal cord and brainstem, and pharmacological agents that enhance
the activation of such receptors will thus increase inhibitory neurotransmission in these
regions.

Glycine functions in excitatory transmission by modulating the actions of
glutamate, the major excitatory neurotransmitter in the central nervous system. See
Johnson and Ascher, Nature 325: 529-531, 1987; Fletcher et al., Glycine Transmission,
Otterson and Storm-Mathisen, eds., 1990, pp. 193-219. Specifically, glycine is an
obligatory co-agonist at the class of glutamate receptor termed N-methyl-D-aspartate
(NMDA) receptor. Activation of NMDA receptors on a neuron increases sodium and
calcium conductance, which depolarizes the neuron, thereby increasing the likelihood that
the neuron will fire an action potential. NMDA receptors are widely distributed
throughout the brain, with a particularly high density in the cerebral cortex and
hippocampal formation.

Molecular cloning has revealed the existence in mammalian brains of two
classes of glycine transporters, termed GlyT-1 and GlyT-2. GlyT-1 is found
predominantly in the forebrain, and its distribution corresponds to that of glutamatergic
pathways and NMDA receptors (Smith, et al., Neuron 8: 927-935, 1992). The
distribution of GlyT-2 differs; this transporter is found predominantly in the brain stem
and spinal cord, and its distribution corresponds closely to that of strychnine-sensitive
glycine receptors. Liu et al., J. Biol. Chem. 268: 22802-22808, 1993; Jursky and Nelson,
J. Neurochem. 64: 1026-1033, 1995. These observations are consistent with the view
that, by regulating the synaptic levels of glycine, GlyT-1 and GlyT-2 preferentially
influence the activity of NMDA receptors and strychnine-sensitive glycine receptors,
respectively.

Sequence comparisons of GlyT-1 and GlyT-2 have revealed that these
glycine transporters are members of a broader family of sodium-dependent
neurotransmitter transporters, including, for example, transporters specific for
γ-amino-n-butyric acid (GABA) and others. Uhl, Trends in Neuroscience 15: 265-268, 1992; Clark and
Amara, BioEssays 15: 323-332, 1993. Overall, each of these transporters includes 12
putative transmembrane domains that predominantly contain hydrophobic amino acids.
Comparing rat GlyT-1 to rat GlyT-2, using the Lipman-Pearson FASTA algorithm,
reveals a 51% amino acid sequence identity and a 55% nucleic acid sequence identity.
Comparison of the sequence of human GlyT-1 with rat GlyT-2 reveals a 51% amino acid
sequence identity and a 53-55% nucleic acid sequence identity, with the range of values
for nucleic acid sequence identity resulting from the existence of three variant forms of
GlyT-1.

Compounds that inhibit or activate glycine transporters would be expected to
alter receptor function, and provide therapeutic benefits in a variety of disease states. For
example, inhibition of GlyT-2 can be used to diminish the activity of neurons having
strychnine-sensitive glycine receptors via increasing synaptic levels of glycine, thus
diminishing the transmission of pain-related (i.e., nociceptive) information in the spinal
cord, which has been shown to be mediated by these receptors. Yaksh, Pain 111-123,
1989. Additionally, enhancing inhibitory glycinergic transmission through
strychnine-sensitive glycine receptors in the spinal cord can be used to decrease muscle
hyperactivity, which is useful in treating diseases or conditions associated with increased
muscle contraction, such as spasticity, myoclonus (which relates to rapid muscle spasms),
and epilepsy (Truong et al., Movement Disorders 3: 77-87, 1988; Becker, FASEB J. 4:
2767-2774, 1990). Spasticity that can be treated via modulation of glycine receptors is
associated with epilepsy, stroke, head trauma, multiple sclerosis, spinal cord injury,
dystonia, and other conditions of illness and injury of the nervous system.

Summary of the Invention

In a first embodiment, the invention provides a nucleic acid encoding a
glycine transporter having at least about 96% sequence identity with the protein sequence
of SEQ ID NO:27 or with a sequence corresponding to the protein sequence of SEQ ID
NO:27 except that it has one or more of the following substitutions: (1) Gly102 to Ser, (2)
Ser124 to Phe, (3) Ile279 to Asn, (4) Arg393 to Gly, (5) Lys457 to Asn, (6) Asp463 to Asn,
(7) Cys610 to Tyr, (8) Ile611 to Val, (9) Phe733 to Ser, (10) Ile735 to Val, (11) Phe245 to
Leu, (12) Val305 to Leu, (13) Thr366 to Ile or (14) Leu400 to Pro. Preferably, the
sequence identity is at least about 97%, more preferably at least about 98%, yet more
preferably at least about 99%, yet more preferably at least about 99.5%. In an
embodiment of the invention, the sequence identity is 100%. Preferably, the encoded
glycine transporter has no more than four amino acid differences in the region from
amino acid 200 to 797 of the reference protein sequence, where the reference sequence is
SEQ ID NO:27 or a sequence corresponding to the protein sequence of SEQ ID
NO:27 except that it has one of the substitutions described above. More preferably, the
encoded glycine transporter has no more than two such differences.
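By way of illustration only, the substitution and difference-count criteria of this embodiment can be sketched in Python. The sketch below is a minimal example under stated assumptions: the toy reference sequence, the positions used, and the function names `apply_substitutions`, `count_differences` and `ungapped_percent_identity` are hypothetical and are not SEQ ID NO:27 or any part of the disclosed sequences. It assumes the two sequences are already aligned to equal length without gaps.

```python
# Illustrative sketch only: the toy reference below is NOT SEQ ID NO:27.

def apply_substitutions(reference, substitutions):
    """Return a copy of `reference` with each (position, from_aa, to_aa) applied.

    Positions are 1-based, as in the specification. A ValueError is raised if
    the reference does not carry the expected residue at a given position.
    """
    residues = list(reference)
    for position, from_aa, to_aa in substitutions:
        if residues[position - 1] != from_aa:
            raise ValueError(f"expected {from_aa} at position {position}")
        residues[position - 1] = to_aa
    return "".join(residues)


def count_differences(seq_a, seq_b, start, end):
    """Count mismatched residues over the 1-based, inclusive region start..end.

    Assumes the sequences are pre-aligned and of equal length (no gaps).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    region_a = seq_a[start - 1:end]
    region_b = seq_b[start - 1:end]
    return sum(1 for x, y in zip(region_a, region_b) if x != y)


def ungapped_percent_identity(seq_a, seq_b):
    """Percent of identical positions between two equal-length sequences."""
    differences = count_differences(seq_a, seq_b, 1, len(seq_a))
    return 100.0 * (len(seq_a) - differences) / len(seq_a)


# Hypothetical ten-residue reference and two hypothetical substitutions.
toy_reference = "MDCGSAPKLI"
toy_variant = apply_substitutions(toy_reference, [(4, "G", "S"), (9, "L", "P")])

print(toy_variant)
print(count_differences(toy_reference, toy_variant, 5, 10))
print(ungapped_percent_identity(toy_reference, toy_variant))
```

With a full-length reference, the same `count_differences` call restricted to positions 200 to 797 would express the "no more than four amino acid differences" preference stated above.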

The invention also provides a vector comprising the nucleic acid described 
above. In one embodiment, the vector is effective to express a glycine transporter mRNA 
in at least one of a bacterial cell or a eukaryotic cell. In another embodiment of the 
invention, the vector is effective to express the mRNA in at least one of a yeast cell, a 
mammalian cell or an avian cell. 

The invention further provides an isolated glycine transporter derived from
transformed cells according to the invention, the transporter comprising the amino acid
sequence encoded by the above-described nucleic acid or one to two contiguous portions
of amino acid sequence encoded by such a nucleic acid, wherein the protein has glycine
transporter activity and differs in sequence from the aligned segments of the rat
transporter sequence. The phrase "contiguous sequence," as used herein, refers to
uninterrupted portions of the relevant reference nucleic acid or amino acid sequence.
Preferably, the glycine transporter protein of the present invention differs in sequence
from the aligned segments of the rat transporter sequence by at least two amino acids,
more preferably at least four amino acids. Preferably, the contiguous sequences
comprise at least about 600 amino acids, more preferably at least about 700 amino acids,
more preferably at least about 750 amino acids. In one embodiment, the transporter
protein comprises all of the protein sequence encoded by the above-described nucleic
acid. Preferably, the transporter protein comprises amino acid sequence set forth in the
protein sequence of SEQ ID NO:27 or a sequence corresponding to the protein sequence
of SEQ ID NO:27 except that it has one or more of the following substitutions: (1) for
Gly102, Ser, (2) for Ser124, Phe, (3) for Ile279, Asn, (4) for Arg393, Gly, (5) for Lys457,
Asn, (6) for Asp463, Asn, (7) for Cys610, Tyr, (8) for Ile611, Val, (9) for Phe733, Ser, (10)
for Ile735, Val, (11) for Phe245, Leu, (12) for Val305, Leu, (13) for Thr366, Ile or (14) for
Leu400, Pro, or an amino acid sequence comprising one to two contiguous portions of
these sequences. In a preferred embodiment, the invention provides a glycine transporter
and associated nucleic acids, vectors and methods, wherein the protein sequence
comprises at least one of (1) Ser102, (2) Phe124, (3) Asn279, (4) Gly393, (5) Asn457, (6)
Asn463, (7) Tyr610, (8) Val611, (9) Ser733, (10) Val735, (11) Leu245, (12) Leu305, (13) Ile366
and (14) Pro400. Preferably, the sequence comprises at least two of these amino acid
residues, more preferably at least four, yet more preferably all of these amino acid
residues.

In a second embodiment, the invention also provides a nucleic acid encoding
a transporter protein having at least about 99.5% sequence identity with all or one to two
contiguous portions of the amino acid sequence of SEQ ID NO:27 or with one to two
contiguous portions of an amino acid sequence corresponding to the protein sequence of
SEQ ID NO:27 except that it has one or more of the following substitutions: (1) Gly102 to
Ser, (2) Ser124 to Phe, (3) Ile279 to Asn, (4) Arg393 to Gly, (5) Lys457 to Asn, (6) Asp463
to Asn, (7) Cys610 to Tyr, (8) Ile611 to Val, (9) Phe733 to Ser, (10) Ile735 to Val, (11)
Phe245 to Leu, (12) Val305 to Leu, (13) Thr366 to Ile or (14) Leu400 to Pro, wherein the
encoded protein has glycine transporter activity. Preferably, the contiguous sequences
comprise at least about 600 amino acids, more preferably at least about 700 amino acids,
more preferably at least about 750 amino acids. The invention also provides a vector
comprising this nucleic acid. In one embodiment, the vector is effective to express a
glycine transporter mRNA in at least one of a prokaryotic cell such as a bacterial cell or
a eukaryotic cell. In another embodiment of the invention, the vector is effective to
express the mRNA in at least one of a yeast cell, a mammalian cell or an avian cell.

The invention additionally provides a cell comprising a first
extrinsically-derived nucleic acid according to the first embodiment or a second extrinsically-derived
nucleic acid encoding a transporter protein having at least about 99.5% sequence identity
with one to two contiguous portions of the protein sequence of SEQ ID NO:27 or of a
sequence corresponding to the protein sequence of SEQ ID NO:27 except that it has one
or more of the following substitutions: (1) Gly102 to Ser, (2) Ser124 to Phe, (3) Ile279 to
Asn, (4) Arg393 to Gly, (5) Lys457 to Asn, (6) Asp463 to Asn, (7) Cys610 to Tyr, (8) Ile611
to Val, (9) Phe733 to Ser, (10) Ile735 to Val, (11) Phe245 to Leu, (12) Val305 to Leu, (13)
Thr366 to Ile or (14) Leu400 to Pro, wherein the encoded protein has glycine transporter
activity. In one embodiment, the cell expresses a glycine transporter from the nucleic
acid. Preferably, the nucleic acid is functionally associated with a promoter that is
operative in the cell. In an embodiment of the invention, the promoter is an inducible
promoter.

The invention also provides a method of producing a glycine transporter
comprising growing the cells described in the previous paragraph. This method can
further comprise at least one of (a) isolating membranes from said cells, which
membranes comprise the glycine transporter or (b) extracting a protein fraction from the
cells, which fraction comprises the glycine transporter.

An embodiment of the invention provides a method for characterizing a
bioactive agent for treatment of a nervous system disorder or condition or for identifying
bioactive agents for treatment of a nervous system disorder or condition, the method
comprising (a) providing a first assay composition comprising (i) a cell as described
above or (ii) an isolated glycine transporter protein comprising the amino acid sequence
encoded by the first or second extrinsically-derived nucleic acids described above, (b)
contacting the first assay composition with the bioactive agent or a prospective bioactive
agent, and (c) measuring the amount of glycine transport exhibited by the assay composition.
Preferably, the method further comprises comparing the amount of glycine transport
exhibited by the first assay composition with the amount of glycine transport exhibited by
a second such assay composition that is treated the same as the first assay composition
except that it is not contacted with the bioactive agent or prospective bioactive agent.
The method can be used for characterizing bioactive agents where the nervous system
disorder or condition is one of the group consisting of (a) pain, (b) spasticity, (c)
myoclonus, (d) muscle spasm, (e) muscle hyperactivity or (f) epilepsy. In a preferred
embodiment, the spasticity for which the bioactive agent is characterized is associated
with stroke, head trauma, neuronal cell death, multiple sclerosis, spinal cord injury,
dystonia, Huntington's disease or amyotrophic lateral sclerosis.
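The comparison between the first (agent-treated) and second (untreated) assay compositions can be expressed numerically. The following Python sketch is illustrative only. It assumes that glycine transport is read out as a single uptake value per composition and that a background value (for example from mock-transfected cells) is subtracted from both; the function names and all numbers are hypothetical and are not data from this specification.

```python
# Illustrative sketch: all uptake values below are made-up placeholders.

def specific_uptake(total_uptake, background_uptake):
    """Uptake attributable to the transporter, after background subtraction."""
    return total_uptake - background_uptake


def percent_inhibition(control_uptake, treated_uptake, background_uptake=0.0):
    """Percent reduction in specific glycine uptake caused by the agent.

    control_uptake: uptake in the second (untreated) assay composition.
    treated_uptake: uptake in the first (agent-treated) assay composition.
    A negative result indicates that the agent increased transport.
    """
    control_specific = specific_uptake(control_uptake, background_uptake)
    treated_specific = specific_uptake(treated_uptake, background_uptake)
    if control_specific <= 0:
        raise ValueError("control uptake must exceed background")
    return 100.0 * (1.0 - treated_specific / control_specific)


# Hypothetical readings in arbitrary units.
print(percent_inhibition(control_uptake=1000, treated_uptake=400, background_uptake=200))
print(percent_inhibition(control_uptake=1000, treated_uptake=1200, background_uptake=200))
```

Reporting a signed percentage lets the same calculation characterize both inhibitors and activators of transport.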

The invention further provides a nucleic acid that hybridizes with a reference
nucleic acid sequence which is SEQ ID NO:26 or a sequence that varies from the nucleic
acid sequence of SEQ ID NO:26 by having one or more of the following substitutions: (a)
T6 to C, (b) G304 to A, (c) C371 to T, (d) C571 to T, (e) T836 to A, (f) A1116 to G, (g)
A1177 to G, (h) G1371 to C, (i) G1387 to A, (j) G1829 to A, (k) A1831 to G, (l) G2103 to A,
(m) T2198 to C, (n) A2203 to G, (o) C342 to G, (p) C352 to T, (q) T733 to C, (r) A777 to G,
(s) G913 to C, (t) G951 to A, (u) C1097 to T or (v) T1199 to C, under conditions of
sufficient stringency to exclude hybridizations with (a) the sequence for a rat or mouse
GlyT-2 transporter or (b) the sequence for a mammalian GlyT-1 transporter. Preferably,
the nucleic acid sequence is at least about 18 nucleotides in length and has at least about
95% sequence identity with a sequence embedded in the reference nucleic acid sequence.
Preferably the nucleic acid sequence is at least about 40 nucleotides in length, more
preferably at least about 100 nucleotides in length. Preferably the nucleic acid sequence
has at least about 97% sequence identity with the above-recited reference sequence, more
preferably 99% sequence identity. Preferably, the nucleic acid is a PCR primer and the
stringent conditions are PCR conditions effective to amplify a human GlyT-2 sequence
but not to amplify (a) the sequence for a rat or mouse GlyT-2 transporter or (b) the
sequence for a mammalian GlyT-1 transporter.

Further, the invention provides a nucleic acid of at least about 18 nucleotides
in length comprising a contiguous sequence from the coding or noncoding strand of a
human GlyT-2 gene or cDNA, wherein the contiguous sequence has at least 1 sequence
difference when compared with the rat GlyT-2 gene sequence that aligns with the
contiguous sequence. Preferably the nucleic acid sequence is at least about 40
nucleotides in length, more preferably at least about 100 nucleotides in length.
Preferably, the contiguous sequence has at least two differences, more preferably 3
differences when compared with the rat GlyT-2 gene sequence that aligns with the
contiguous sequence.

Still further, the invention provides an antisense molecule comprising a
contiguous sequence from a coding or non-coding strand of a human gene or cDNA for
GlyT-2 which is effective when administered to a cell, tissue, organ or animal to reduce
the expression of GlyT-2 in the cell or in a cell of the tissue, organ or animal, wherein
the contiguous sequence has at least 1 sequence difference when compared with the rat
GlyT-2 gene sequence that aligns with said contiguous sequence. Preferably, the
contiguous sequence has at least two differences, more preferably 3 differences when
compared with the rat GlyT-2 gene sequence that aligns with the contiguous sequence.
The phrase "antisense molecule" is used herein to refer to a molecule designed to bind
genomic DNA or mRNA to interfere in transcription or translation, including interfering
with mRNA stability. Preferably, the contiguous sequence is at least about 15
nucleotides in length. Preferably, the contiguous stretch is included in the coding or
non-coding strand of the reference nucleic acid sequence. Preferably, the contiguous stretch is
in the coding or non-coding strand of the nucleic acid sequence of SEQ ID NO:26. The
invention further provides an expression vector comprising such an antisense molecule.

The invention also provides a method of reducing GlyT-2 expression in a
tissue or cell comprising applying to the tissue or cell a GlyT-2 expression reducing
effective amount of such an antisense molecule or a GlyT-2 expression reducing effective
amount of an expression vector for expressing such an antisense molecule in a tissue or
cell. Alternatively, the invention provides a method of treating a nervous system disorder
or condition comprising applying to a tissue or cell of a human patient a nervous system
disorder or condition treating effective amount of such an antisense molecule or a
nervous system disorder or condition treating effective amount of an expression vector
for expressing such an antisense molecule in a tissue or cell.

Further, the invention provides a method for detecting whether an animal has
autoimmune antibodies against a glycine transporter, the method comprising contacting an
antibody preparation from the animal or a body fluid from the animal with a polypeptide
antigen comprising a glycine transporter or derived from the glycine transporter.
Preferably, the polypeptide antigen comprises a contiguous sequence from the
protein sequence of SEQ ID NO:27 or from a sequence corresponding to the protein
sequence of SEQ ID NO:27 except that it has one or more of the following substitutions:
(1) Gly102 to Ser, (2) Ser124 to Phe, (3) Ile279 to Asn, (4) Arg393 to Gly, (5) Lys457 to
Asn, (6) Asp463 to Asn, (7) Cys610 to Tyr, (8) Ile611 to Val, (9) Phe733 to Ser, (10) Ile735
to Val, (11) Phe245 to Leu, (12) Val305 to Leu, (13) Thr366 to Ile or (14) Leu400 to Pro.
Preferably, the contiguous sequence is at least about six amino acids in length, more
preferably at least about ten amino acids in length, still more preferably at least about
fifteen amino acids in length. In one embodiment of the invention, the peptide antigen is
selective for antibodies against either a GlyT-1 transporter or a GlyT-2 transporter.

Brief Description of the Drawings

Figure 1 shows the alignment of several gene fragments of the human GlyT-2
gene.

Figure 2 illustrates which fragment clones were used to construct the clone
incorporating the nucleic acid sequence of SEQ ID NO:20, a full-length clone of the
human GlyT-2 gene.

Figure 3 shows a comparison between the nucleic acid sequence of SEQ ID
NO:18 and the rat GlyT-2 sequence.

Figure 4 shows a comparison between the amino acid sequence of SEQ ID
NO:19 and the rat GlyT-2 sequence.

Figure 5 shows the measurement of glycine transport in QT-6 cells either
transfected with a human GlyT-2 expression vector or mock transfected.

Figure 6 shows the concentration dependence of glycine transport in QT-6
cells transfected with human GlyT-2.

Definitions

For the purposes of this application, the following terms shall have the
meaning set forth below.

o Bioactive agent

A bioactive agent is a substance such as a chemical that can act on a cell, virus, tissue,
organ or organism, including but not limited to drugs (i.e., pharmaceuticals), to create a
change in the functioning of the cell, virus, organ or organism. Preferably, the organism
is a mammal, more preferably a human. In a preferred embodiment of the invention, the
method of identifying bioactive agents of the invention is applied to organic molecules
having molecular weight of about 1500 or less.

o Extrinsically-derived nucleic acid

Extrinsically-derived nucleic acids are nucleic acids found in a cell that were introduced
into the cell, a parent or ancestor of the cell, or a transgenic animal from which the cell
is derived through a recombinant technology.

o Extrinsic promoter functionally associated with a nucleic acid

An extrinsic promoter for a protein-encoding nucleic acid is a promoter distinct from that
used in nature to express a nucleic acid for that protein. A promoter is functionally
associated with the nucleic acid if, in a cell that is compatible with the promoter, the
promoter can act to allow the transcription of the nucleic acid.

o Nucleic acid-specific property

Nucleic acid-specific properties are properties that can be used to distinguish differing
nucleic acid molecules. Such properties include, without limitation, (i) the nucleotide
sequence of all or a portion of the molecule, (ii) the size of the molecule, for instance
determined by electrophoresis, (iii) the fragmentation pattern generated by treatment with
chemicals that fragment nucleic acid or generated by nucleases and (iv) the ability of the
molecule or fragments thereof to hybridize with defined nucleic acid probes or to
generate amplicons with defined primers.

o Prospective agent

Prospective agents are substances which are being tested by the screening method of the
invention to determine if they affect glycine transport.

o Sequence identity

"Identity," as known in the art, is a relationship between two or more polypeptide
sequences or two or more polynucleotide sequences, as determined by comparing the
sequences, particularly, as determined by the match between strings of such sequences.
"Identity" is readily calculated by known methods (Computational Molecular Biology,
Lesk, A.M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics
and Genome Projects, Smith, D.W., ed., Academic Press, New York, 1993; Computer
Analysis of Sequence Data, Part I, Griffin, A.M., and Griffin, H.G., eds., Humana Press,
New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heinje, G., Academic
Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M
Stockton Press, New York, 1991). While there exist a number of methods to measure
identity between two sequences, the term is well known to skilled artisans (see, for
example, Sequence Analysis in Molecular Biology; Sequence Analysis Primer; and
Carillo, H., and Lipman, D., SIAM J. Applied Math., 48: 1073 (1988)). Methods
commonly employed to determine identity between sequences include, but are not limited
to, those disclosed in Carillo, H., and Lipman, D., SIAM J. Applied Math., 48: 1073
(1988) or, preferably, in Needleman and Wunsch, J. Mol. Biol., 48: 443-445, 1970,
wherein the parameters are as set in version 2 of DNASIS (Hitachi Software Engineering
Co., San Bruno, CA). Computer programs for determining identity are publicly available.
Preferred computer program methods to determine identity between two sequences
include, but are not limited to, GCG program package (Devereux, J., et al., Nucleic Acids
Research 12(1): 387 (1984)), BLASTP, BLASTN, and FASTA (Altschul, S.F. et al., J.
Molec. Biol. 215: 403-410 (1990)). The BLAST X program is publicly available from
NCBI (blast@ncbi.nlm.nih.gov) and other sources (BLAST Manual, Altschul, S., et al.,
NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. Biol. 215: 403-410
(1990)).
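For orientation, a global alignment of the Needleman and Wunsch type, followed by an identity calculation, can be sketched in Python as below. This is a minimal illustration under stated assumptions and is not the DNASIS implementation: the scoring values (match +1, mismatch -1, linear gap penalty -1), the convention of dividing identical positions by the full alignment length, and the function names `global_align` and `percent_identity` are all assumptions chosen for the example. The short test strings are hypothetical.

```python
# Illustrative sketch of Needleman-Wunsch-style global alignment.
# Scoring parameters are assumptions, not those of DNASIS or any cited program.

def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)

    def pair_score(i, j):
        # Score for aligning a[i-1] with b[j-1] (1-based matrix indices).
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align two residues
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right corner, preferring diagonal moves.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    aligned_a, aligned_b = global_align(a, b)
    identical = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * identical / len(aligned_a)


print(global_align("ACGT", "AGT"))
print(percent_identity("MKTAYIAK", "MKTAHIAK"))
```

Other denominators (for example the length of the shorter sequence) are also in use, so a reported identity value is only comparable when the convention and scoring parameters are stated alongside it.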

Detailed Description of the Invention

The GlyT-2 nucleic acid sequences of SEQ ID NOS:18 and 26, or the
corresponding encoded protein sequences of SEQ ID NOS:19 and 27, are human relatives
of the rat GlyT-2 sequence reported in Liu et al., J. Biol. Chem. 268: 22802-22808, 1993.
SEQ ID NO:21, the GlyT-2 protein sequence encoded by the nucleic acid sequence of
SEQ ID NO:20, differs from the amino acid sequences of SEQ ID NOS:19 and 27, most
likely reflecting variant forms of human GlyT-2. Additional sequences set forth in SEQ
IDs 1-34 reflect still further variations. These variations primarily arise from the use of
cDNA from pooled mRNA from several donors to generate the clones. In total, the
various human GlyT-2-derived nucleic acids that have been isolated reveal the following
sequence variations:

Nucleotide variation | Encoded amino acid variation | Corresponding amino acid in rat
GAT6 (from SEQ ID NOS:18 and 26) to GAC (from SEQ ID NO:3) | NONE (Asp2 to Asp) | Asp
A304GC (from SEQ ID NO:18) to GGC (from SEQ ID NOS:20 and 26) | Ser102 to Gly | Ser
CCC342 (from SEQ ID NOS:18 and 26) to CCG (from SEQ ID NO:33) | NONE (Pro114 to Pro) | Pro
C352TG (from SEQ ID NOS:18 and 26) to TTG (from SEQ ID NO:31) | NONE (Leu118 to Leu) | Leu
TT371T (from SEQ ID NO:20) to TCT (from SEQ ID NOS:18 and 26) | Phe124 to Ser | Ala
C571GA (from SEQ ID NOS:18 and 26) to TGA (from SEQ ID NO:7) | Arg191 to STOP | Arg
T733TC (from SEQ ID NOS:18 and 26) to CTC (from SEQ ID NO:31) | Phe245 to Leu | Phe
CCA777 (from SEQ ID NOS:18 and 26) to CCG (from SEQ ID NO:33) | NONE (Pro259 to Pro) | Pro
AT836C (from SEQ ID NOS:18 and 26) to AAC (from SEQ ID NO:20) | Ile279 to Asn | Ile
G913TA (from SEQ ID NOS:18 and 26) to CTA (from SEQ ID NO:35) | Val305 to Leu | Val
ACG951 (from SEQ ID NOS:18 and 26) to ACA (from SEQ ID NO: [illegible in source]) | NONE (Thr317 to Thr) | Thr
AC1097A (from SEQ ID NOS:18 and 26) to ATA (from SEQ ID NO:31) | Thr366 to Ile | Thr
GAG1116 (from SEQ ID NO:20) to GAA (from SEQ ID NOS:18 and 26) | NONE (Glu372 to Glu) | Glu
G1177GG (from SEQ ID NO:5) to AGG (from SEQ ID NOS:18 and 26) | Gly393 to Arg | Arg
CT1199C (from SEQ ID NOS:18 and 26) to CCC (from SEQ ID NO:33) | Leu400 to Pro | Leu
AAC1371 (from SEQ ID NO:10) to AAG (from SEQ ID NOS:18 and 26) | Asn457 to Lys | Lys
G1387AT (from SEQ ID NOS:18 and 26) to AAT (from SEQ ID NO:12) | Asp463 to Asn | Asp
TG1829C (from SEQ ID NOS:18 and 26) to TAC (from SEQ ID NO:22) | Cys610 to Tyr | Cys
A1831TT (from SEQ ID NOS:18 and 26) to GTT (from SEQ ID NO:20) | Ile611 to Val | Ile
GAG2103 (from SEQ ID NOS:18 and 26) to GAA (from SEQ ID NO:24) | NONE (Glu701 to Glu) | Glu
TT2198T (from SEQ ID NOS:18 and 26) to TCT (from SEQ ID NO:24) | Phe733 to Ser | Phe
A2203TA (from SEQ ID NOS:18 and 26) to GTA (from SEQ ID NO:22) | Ile735 to Val | Ile


Irrespective of the source of this variation, the point variations in peptide sequence,
excepting the insertion of the stop codon, are believed not to adversely affect the
functioning of GlyT-2. The GlyT-2 protein sequences of SEQ ID NO:19 and SEQ ID
NO:27 are especially preferred, with SEQ ID NO:27 most preferred. The nucleic
acid sequence of SEQ ID NO:26 is believed to represent the major consensus sequence.
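The codon-level consequences of point variations of this kind can be cross-checked mechanically. The Python sketch below is offered as an illustration only: it builds the standard genetic code, maps a 1-based nucleotide position to its codon number, and labels a codon change as silent, missense or nonsense. The function name `classify_variation` and the one-letter amino acid output are choices made for this example; the codon pairs in the usage lines are taken from the variation table above.

```python
# Illustrative sketch: classify a codon change using the standard genetic code.

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering ("*" marks a stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def classify_variation(position, reference_codon, variant_codon):
    """Return (codon_number, reference_aa, variant_aa, kind).

    position is the 1-based nucleotide position of the varied base, counted
    from the first base of the coding sequence.
    """
    codon_number = (position - 1) // 3 + 1
    reference_aa = CODON_TABLE[reference_codon]
    variant_aa = CODON_TABLE[variant_codon]
    if reference_aa == variant_aa:
        kind = "silent"
    elif variant_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return codon_number, reference_aa, variant_aa, kind


# Codon pairs as listed in the variation table of this specification.
for position, reference_codon, variant_codon in [
    (6, "GAT", "GAC"),
    (571, "CGA", "TGA"),
    (1199, "CTC", "CCC"),
    (2203, "ATA", "GTA"),
]:
    print(position, classify_variation(position, reference_codon, variant_codon))
```

A check of this kind confirms only that a codon change is consistent with the stated amino acid change; it says nothing about whether the variant transporter remains functional.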

The above-described variations primarily reflect sequence variations between
human individuals. The material used to generate the nucleic acid sequences described
above comprised pools from either twenty-six or ninety-two individuals, depending on
the particular nucleic acid sequence. The use of pooled source material, together with the
prevalence of silent or conservative substitutions, supports the conclusion that the
variations are reflective of human-derived variations rather than mutations generated by
the amplification reactions.

The relationship between the human nucleotide sequence of SEQ ID NO:18
and the rat nucleotide sequence for GlyT-2, and between the protein sequences that they
encode, is as set forth in the tables below. The relatedness values set forth in these tables
were determined using the FASTA computer program described by Pearson and Lipman,
Proc. Natl. Acad. Sci. USA 85: 2444-2448, 1988.



Nucleotide Sequence (numbered as in SEQ ID NO:18) | Percent Identity
nt 1-2397 (whole sequence) | 89
nt 1-600 | 82.5
nt [range illegible in source] | 78
nt 600-2397 | 91.2

Amino Acid Sequence (numbered as in SEQ ID NO:19) | Percent Identity
aa 1-797 | 94.4
aa 1-150 | 77.1
aa 1-200 | 80.3
aa 150-797 | 98.5
aa 200-797 | 99.2

Nucleic A cid - encoding elvcine transporter 

To construct non-naturally occurring glycine transporter-encoding nucleic
acids, the native sequences can be used as a starting point and modified to suit particular
needs. For instance, the sequences can be mutated to incorporate useful restriction sites.
See Maniatis et al., Molecular Cloning, a Laboratory Manual (Cold Spring Harbor Press,
1989). Such restriction sites can be used to create "cassettes", or regions of nucleic acid
sequence that are facilely substituted using restriction enzymes and ligation reactions.
The cassettes can be used to substitute synthetic sequences encoding mutated glycine
transporter amino acid sequences. Alternatively, the glycine transporter-encoding
sequence can be substantially or fully synthetic. See, for example, Goeddel et al., Proc.
Natl. Acad. Sci. USA, 76, 106-110, 1979. For recombinant expression purposes, codon
usage preferences for the organism in which such a nucleic acid is to be expressed are
advantageously considered in designing a synthetic glycine transporter-encoding nucleic
acid. For example, a nucleic acid sequence incorporating prokaryotic codon preferences
can be designed from a mammalian-derived sequence using a software program such as
Oligo-4, available from National Biosciences, Inc. (Plymouth, MN).
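
By way of illustration only, the idea of designing a synthetic coding sequence from codon preferences can be sketched as follows. The function name and the preference table are hypothetical: each listed codon does encode the stated amino acid, but the choice of "preferred" codon is an illustrative placeholder and not measured usage data for any organism, and the sketch is not the Oligo-4 program.

```python
# Hypothetical preference table: one chosen codon per amino acid (three-letter code).
# Each codon encodes the stated residue; which codon is "preferred" is a placeholder.
PREFERRED_CODON = {
    "Met": "ATG",
    "Gly": "GGC",
    "Lys": "AAA",
    "Leu": "CTG",
    "Glu": "GAA",
}


def back_translate(residues, table=PREFERRED_CODON):
    """Return a DNA string encoding `residues`, using one codon per residue."""
    missing = sorted({r for r in residues if r not in table})
    if missing:
        raise KeyError(f"no preferred codon defined for: {missing}")
    return "".join(table[r] for r in residues)


if __name__ == "__main__":
    # Hypothetical four-residue peptide, not taken from any SEQ ID NO.
    print(back_translate(["Met", "Gly", "Lys", "Leu"]))
```

A complete design would need a table covering all twenty amino acids, taken from usage data for the intended host.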

The nucleic acid sequence embodiments of the invention are preferably 
deoxyribonucleic acid sequences, preferably double-stranded deoxyribonucleic acid 
sequences. However, they can also be ribonucleic acid sequences. 

Numerous methods are known to delete sequence from or mutate nucleic
acid sequences that encode a protein and to confirm the function of the proteins encoded
by these deleted or mutated sequences. Accordingly, the invention also relates to a
mutated or deleted version of a human nucleic acid sequence that encodes a protein that
retains the ability to bind specifically to glycine and to transport glycine across a
membrane. These analogs can have N-terminal, C-terminal or internal deletions, so long
as GlyT-2 function is retained. The remaining human GlyT-2 protein sequence will
preferably have no more than about 4 amino acid variations, preferably no more than 2
amino acid variations, more preferably no more than 1 amino acid variation, relative to
the protein sequence of SEQ ID NO:27 or with a sequence corresponding to the protein
sequence of SEQ ID NO:27 except that it has one or more of the following substitutions:
(1) Gly 102 to Ser, (2) Ser 124 to Phe, (3) Ile 279 to Asn, (4) Arg 393 to Gly, (5) Lys 457 to
Asn, (6) Asp 463 to Asn, (7) Cys 610 to Tyr, (8) Ile 611 to Val, (9) Phe 733 to Ser, (10) Ile 735
to Val, (11) Phe 245 to Leu, (12) Val 305 to Leu, (13) Thr 366 to Ile, or (14) Leu 400 to Pro.
More preferably, the variations are relative to the protein sequence of SEQ ID NOS:19 or
27, still more preferably SEQ ID NO:27. In one preferred embodiment, the protein
embodiments of the invention are defined relative to the protein sequence of SEQ ID
NO:27 or with a sequence corresponding to the protein sequence of SEQ ID NO:27
except that it has one or more of the following substitutions: (1) Gly 102 to Ser, (2) Ser 124
to Phe, (3) Ile 279 to Asn, (4) Arg 393 to Gly, (5) Lys 457 to Asn, (6) Asp 463 to Asn, (7)



WO 98/07854 



PCT/US97/14637 



Cys 610 to Tyr, (8) Ile 611 to Val, (9) Phe 733 to Ser, or (10) Ile 735 to Val. The point
variations are preferably conservative point variations. Preferably, the analogs will have
at least about 96% sequence identity, preferably at least about 97%, more preferably at
least about 98%, still more preferably at least about 99%, yet still more preferably at least
about 99.5%, to the protein sequence of SEQ ID NO:27 or with a sequence
corresponding to the protein sequence of SEQ ID NO:27 except that it has one or more
of the following substitutions: (1) Gly 102 to Ser, (2) Ser 124 to Phe, (3) Ile 279 to Asn, (4)
Arg 393 to Gly, (5) Lys 457 to Asn, (6) Asp 463 to Asn, (7) Cys 610 to Tyr, (8) Ile 611 to Val,
(9) Phe 733 to Ser, (10) Ile 735 to Val, (11) Phe 245 to Leu, (12) Val 305 to Leu, (13) Thr 366 to
Ile or (14) Leu 400 to Pro. More preferably, the variations are relative to the protein
sequence of SEQ ID NOS:19 or 27, still more preferably SEQ ID NO:27.

Mutational and deletional approaches can be applied to all of the nucleic acid sequences
of the invention that express human GlyT-2 proteins. As discussed above, conservative
mutations are preferred. Such conservative mutations include mutations that switch one
amino acid for another within one of the following groups:



1. Small aliphatic, nonpolar or slightly polar residues: Ala, Ser,
Thr, Pro and Gly;

2. Polar, negatively charged residues and their amides: Asp, Asn,
Glu and Gln;

3. Polar, positively charged residues: His, Arg and Lys;

4. Large aliphatic, nonpolar residues: Met, Leu, Ile, Val and Cys;
and

5. Aromatic residues: Phe, Tyr and Trp.

A preferred listing of conservative variations is the following:



Original Residue        Variation

Ala                     Gly, Ser
Arg                     Lys
Asn                     Gln, His
Asp                     Glu
Cys                     Ser
Gln                     Asn
Glu                     Asp
Gly                     Ala, Pro
His                     Asn, Gln
Ile                     Leu, Val
Leu                     Ile, Val
Lys                     Arg, Gln, Glu
Met                     Leu, Tyr, Ile
Phe                     Met, Leu, Tyr
Ser                     Thr
Thr                     Ser
Trp                     Tyr
Tyr                     Trp, Phe
Val                     Ile, Leu
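
As a minimal sketch, the five residue groups recited above can be encoded so that a given point variation is checked for membership in a single group. The function name is hypothetical, and the check covers only the five groups, not the preferred listing of variations in the table.

```python
# The five conservative-substitution groups recited in the text (three-letter codes).
GROUPS = [
    {"Ala", "Ser", "Thr", "Pro", "Gly"},   # small aliphatic, nonpolar or slightly polar
    {"Asp", "Asn", "Glu", "Gln"},          # negatively charged residues and their amides
    {"His", "Arg", "Lys"},                 # positively charged residues
    {"Met", "Leu", "Ile", "Val", "Cys"},   # large aliphatic, nonpolar residues
    {"Phe", "Tyr", "Trp"},                 # aromatic residues
]


def is_conservative(original, variant):
    """Return True if both residues belong to the same group in GROUPS."""
    return any(original in group and variant in group for group in GROUPS)


if __name__ == "__main__":
    # Three of the point variations named in the text, used as sample inputs.
    for original, variant in [("Gly", "Ser"), ("Asp", "Asn"), ("Ser", "Phe")]:
        print(original, "->", variant, is_conservative(original, variant))
```

Under this grouping, Gly to Ser and Asp to Asn fall within one group, whereas Ser to Phe does not.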



The types of variations selected may be based on the analysis of the frequencies of amino
acid variations between homologous proteins of different species developed by Schulz et
al., Principles of Protein Structure, Springer-Verlag, 1978, on the analyses of structure-
forming potentials developed by Chou and Fasman, Biochemistry 13, 211, 1974 and Adv.
Enzymol. 47, 45-149, 1978, and on the analysis of hydrophobicity patterns in proteins
developed by Eisenberg et al., Proc. Natl. Acad. Sci. USA 81, 140-144, 1984; Kyte &
Doolittle, J. Molec. Biol. 157, 105-132, 1981, and Goldman et al., Ann. Rev. Biophys.
Chem. 15, 321-353, 1986. All of the references of this paragraph are incorporated herein
in their entirety by reference.

Since the ten identified point variations which create amino acid substitutions
between the various human GlyT-2 mRNAs identified herein are believed to be useful in
creating functional GlyT-2, proteins incorporating all combinations of these point
variations are believed to be functional. These variations are within the invention.

For the purposes of this application, a nucleic acid of the invention is
"isolated" if it has been separated from other macromolecules of the cell or tissue from
which it is derived. Preferably, the composition containing the nucleic acid is at least
about 10-fold enriched, with respect to nucleic acid content, over the composition of the
source cells. Preferably, the nucleic acid is substantially pure, meaning purity of at least
about 60% w/w with respect to other nucleic acids, more preferably about 80%, still
more preferably about 90%, yet more preferably about 95%.
Hybridization Probes 

It will be recognized that many deletional or mutational analogs of nucleic
acid sequences for a glycine transporter will be effective hybridization probes for glycine
transporter-encoding nucleic acid. Accordingly, the invention relates to nucleic acid
sequences that hybridize with such glycine transporter-encoding nucleic acid sequences
under stringent conditions. Preferably, the nucleic acid sequence hybridizes with the
nucleic acid sequence of SEQ ID NO:26 or with a nucleic acid sequence that varies
therefrom by one or more of the following substitutions: (a) T 6 to C, (b) G 304 to A, (c)
C 371 to T, (d) C 571 to T, (e) T 836 to A, (f) A 1116 to G, (g) A 1177 to G, (h) G 1371 to C, (i)
G 1387 to A, (j) G 1829 to A, (k) A 1831 to G, (l) G 2103 to A, (m) T 2198 to C, (n) A 2203 to G,
(o) C 342 to G, (p) C 352 to T, (q) T 733 to C, (r) A 7[illegible] to G, (s) G 913 to C, (t) G 951 to A, (u)
C 1097 to T or (v) T 1199 to C. In one embodiment, the nucleic acid (or the functional
equivalent) embodiments of the invention are defined relative to the nucleic acid
sequence of SEQ ID NO:26 or with a nucleic acid sequence that varies therefrom by
one or more of the following substitutions: (a) T 6 to C, (b) G 304 to A, (c) C 371 to T, (d)
C 571 to T, (e) T 836 to A, (f) A 1116 to G, (g) A 1177 to G, (h) G 1371 to C, (i) G 1387 to A, (j)
G 1829 to A, (k) A 1831 to G, (l) G 2103 to A, (m) T 2198 to C, or (n) A 2203 to G.
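
The nucleotide positions recited above can be related to the amino acid positions recited earlier by simple codon arithmetic. The sketch below assumes, as a labeled assumption not stated in the text, that nucleotide 1 of SEQ ID NO:26 is the first base of the initiation codon; the pairing of each nucleotide position with an amino acid position is likewise inferred here from the arithmetic alone.

```python
def codon_number(nt_position):
    """Return the 1-based codon number that contains a 1-based nucleotide position.

    Assumes the reading frame starts at nucleotide 1 (an assumption of this sketch).
    """
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_position - 1) // 3 + 1


# (nucleotide position, amino acid position) pairs, using positions recited in the text.
# The pairing itself is an inference of this sketch.
INFERRED_PAIRS = [
    (304, 102), (371, 124), (836, 279), (1371, 457), (1387, 463),
    (1829, 610), (1831, 611), (2198, 733), (2203, 735),
]

if __name__ == "__main__":
    for nt, aa in INFERRED_PAIRS:
        print(nt, "-> codon", codon_number(nt), "| recited residue", aa)
```

Under that assumption each listed nucleotide position falls in the codon of the correspondingly numbered residue; for example, G 304 lies in codon 102, consistent with the Gly 102 to Ser variation.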

"Stringent conditions" refers to conditions that allow for the hybridization of
substantially related nucleic acid sequences. For instance, such conditions will generally
allow hybridization of sequence with at least about 85% sequence identity, preferably
with at least about 90% sequence identity, more preferably with at least about 95%
sequence identity. Such hybridization conditions are described by Sambrook et al.,
Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Press, 1989.
Hybridization conditions and probes can be adjusted in well-characterized ways to
achieve selective hybridization of human-derived probes.
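
For orientation, percent sequence identity between two sequences that have already been aligned to equal length can be computed as sketched below. This is a simplified position-by-position count; it is not the FASTA program, it performs no alignment, and the function name and example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Hypothetical ten-base sequences differing at one position.
    identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(identity, identity >= 85.0)
```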

Nucleic acid molecules that will hybridize to a glycine transporter-encoding
nucleic acid under stringent conditions can be identified functionally, using methods
outlined above, or by using, for example, the hybridization rules reviewed in Sambrook et
al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Press, 1989.

Without limitation, examples of the uses for hybridization probes include:
histochemical uses such as identifying tissues that express the human GlyT-2 transporter;
measuring mRNA levels, for instance to identify a sample's tissue type or to identify cells
that express abnormal levels of glycine transporter; and detecting polymorphisms in the
glycine transporter gene. RNA hybridization procedures are described in Maniatis et al.,
Molecular Cloning, a Laboratory Manual (Cold Spring Harbor Press, 1989).
PCR Primers 

Rules for designing polymerase chain reaction ("PCR") primers are now
established, as reviewed by PCR Protocols, Cold Spring Harbor Press, 1991. Degenerate
primers, i.e., preparations of primers that are heterogeneous at given sequence locations,
can be designed to amplify nucleic acid sequences that are highly homologous to, but not
identical to, a human GlyT-2 nucleic acid. Strategies are now available that allow for
only one of the primers to be required to specifically hybridize with a known sequence.
See Frohman et al., Proc. Natl. Acad. Sci. USA 85: 8998, 1988 and Loh et al., Science
243: 217, 1989. For example, appropriate nucleic acid primers can be ligated to the
nucleic acid sought to be amplified to provide the hybridization partner for one of the
primers. In this way, only one of the primers need be based on the sequence of the
nucleic acid sought to be amplified.
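
The notion of a degenerate primer, a preparation heterogeneous at given positions, can be illustrated by expanding a primer written with the standard IUPAC nucleotide ambiguity codes into the individual sequences it represents. The function name and the example primer are hypothetical and are not taken from any GlyT-2 sequence.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """List every concrete sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # Hypothetical primer with two two-fold degenerate positions (Y = C or T).
    for sequence in expand_degenerate("ATGGAYTGY"):
        print(sequence)
```

The number of members in the preparation is the product of the degeneracies at each position, here 2 x 2 = 4.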

PCR methods of amplifying nucleic acid will utilize at least two primers.
One of these primers will be capable of hybridizing to a first strand of the nucleic acid to
be amplified and of priming enzyme-driven nucleic acid synthesis in a first direction.
The other will be capable of hybridizing the reciprocal sequence of the first strand (if the
sequence to be amplified is single stranded, this sequence will initially be hypothetical,
but will be synthesized in the first amplification cycle) and of priming nucleic acid
synthesis from that strand in the direction opposite the first direction and towards the site
of hybridization for the first primer. Conditions for conducting such amplifications,
particularly under preferred stringent hybridization conditions, are well known. See, for
example, PCR Protocols, Cold Spring Harbor Press, 1991.
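
The two-primer geometry described above can be sketched in a few lines. Given one strand of a target written 5' to 3', one primer copies the 5' end of that strand (and so hybridizes to the reciprocal strand), and the other is the reverse complement of the 3' end (and so hybridizes to the given strand). The function names, the primer length and the target sequence are hypothetical, and no melting-temperature or specificity criteria are applied.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_pair(target, length=6):
    """Return (forward, reverse) primers flanking `target`, each `length` bases long.

    forward: identical to the 5' end of the given strand.
    reverse: reverse complement of the 3' end of the given strand.
    """
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    target = target.upper()
    return target[:length], reverse_complement(target[-length:])


if __name__ == "__main__":
    # Hypothetical 16-base target; real primers are considerably longer.
    print(primer_pair("ATGGCCTTTAAGCTGA", length=6))
```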

Vectors

A suitable expression vector is capable of fostering expression of the
included GlyT-2 encoding DNA in a host cell, which can be eukaryotic, fungal, or
prokaryotic. Suitable expression vectors include pRc/CMV (Invitrogen, San Diego, CA),
pRc/RSV (Invitrogen), pcDNA3 (Invitrogen), Zap Express Vector (Stratagene Cloning
Systems, La Jolla, CA), pBk/CMV or pBk-RSV vectors (Stratagene), Bluescript II SK +/-
Phagemid Vectors (Stratagene), LacSwitch (Stratagene), pMAM and pMAM neo
(Clontech, Palo Alto, CA), pKSV10 (Pharmacia, Piscataway, NJ), pCRscript (Stratagene)
and pCR2.1 (Invitrogen), among others. Useful yeast expression systems include, for
example, pYEUra3 (Clontech). Useful baculovirus vectors include several viral vectors
from Invitrogen (San Diego, CA) such as pVL1393, pVL1392, pBluBac2, pBluBacHis A,
B or C, and pbacPAC6 (from Clontech).

Cells 

In one embodiment of the invention, the transporter is preferably expressed
in a mammalian cell line, preferably a transformed cell line with an established cell culture
history. In this embodiment, particularly preferred cell lines include COS-1, COS-7,
LM(tk-), HeLa, HEK293, CHO, Rat-1 and NIH3T3. Other preferred cells include avian
cells such as QT-6 cells. Other cells that can be used include insect cells such as
drosophila cells, fish cells, amphibian cells and reptilian cells.

In another embodiment, the transporter is expressed in a cell line that is more 
inexpensively maintained and grown than are mammalian cell lines, such as a bacterial 
cell line or a yeast cell line. 
Isolated Glycine Transporter 

The invention also provides for the human GlyT-2 proteins encoded by any 
of the nucleic acids of the invention preferably in a purity of at least about 80% with 
respect to proteins, preferably 90%, more preferably 95%. The purities are achieved, for 
example, by applying protein purification methods, such as those described below, to a 
lysate of a recombinant cell according to the invention. 

The human GlyT-2 variants of the above paragraphs can be used to create 
organisms or cells that produce human GlyT-2 activity. Purification methods, including 
associated molecular biology methods, are described below. 
Method of Producing Glycine Transporters

One simplified method of isolating polypeptides synthesized by an organism
under the direction of one of the nucleic acids of the invention is to recombinantly
express a fusion protein wherein the fusion partner is facilely affinity purified. For
instance, the fusion partner can be glutathione S-transferase, which is encoded on
commercial expression vectors (e.g., vector pGEX4T3, available from Pharmacia,
Piscataway, NJ). The fusion protein can then be purified on a glutathione affinity
column (for instance, that available from Pharmacia, Piscataway, New Jersey).
Additional fusion partners are available, for example, in various expression vectors sold by
Invitrogen (Carlsbad, CA). Of course, the recombinant polypeptides can be affinity
purified without such a fusion partner using an appropriate antibody that binds to GlyT-2.
Methods of producing such antibodies are available to those of ordinary skill in light of
the ample description herein of GlyT-2 expression systems and known antibody




production methods. See, for example, Ausubel et al., Short Protocols in Molecular
Biology, John Wiley & Sons, New York, 1992. If fusion proteins are used, the fusion
partner can be removed by partial proteolytic digestion approaches that preferentially
attack unstructured regions such as the linkers between the fusion partner and GlyT-2.
The linkers can be designed to lack structure, for instance using the rules for secondary-
structure forming potential developed, for instance, by Chou and Fasman, Biochemistry
13, 211, 1974 and Chou and Fasman, Adv. in Enzymol. 47, 45-147, 1978. The linker can
also be designed to incorporate protease target amino acids, such as arginine and lysine
residues, the amino acids that define the sites cleaved by trypsin, or such as a target
sequence for enterokinase, for example AspAspAspAspLys, which is cleaved after the
lysine residue. To create the linkers, standard synthetic approaches for making
oligonucleotides can be employed together with standard subcloning methodologies.
Other fusion partners besides GST can be used. Procedures that utilize eukaryotic cells,
particularly mammalian cells, are preferred since these cells will post-translationally
modify the protein to create molecules highly similar to or functionally identical to native
proteins.
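
As an illustrative sketch, a candidate linker written in one-letter amino acid code can be scanned for the protease targets named above: arginine and lysine residues (trypsin) and the AspAspAspAspLys sequence (enterokinase). The function names and the example linker are hypothetical, and the simple rule used here ignores any sequence-context effects on cleavage.

```python
ENTEROKINASE_MOTIF = "DDDDK"  # AspAspAspAspLys in one-letter code


def trypsin_sites(linker):
    """1-based positions of Lys (K) and Arg (R) residues; cleavage is taken to follow each."""
    return [i for i, aa in enumerate(linker.upper(), start=1) if aa in "KR"]


def enterokinase_cut(linker):
    """1-based position of the Lys ending the first DDDDK motif, or None if absent."""
    start = linker.upper().find(ENTEROKINASE_MOTIF)
    return None if start < 0 else start + len(ENTEROKINASE_MOTIF)


if __name__ == "__main__":
    linker = "GSGDDDDKGS"  # hypothetical linker sequence
    print(trypsin_sites(linker), enterokinase_cut(linker))
```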

Additional purification techniques can be applied, including without
limitation, preparative electrophoresis, FPLC (Pharmacia, Uppsala, Sweden), HPLC (e.g.,
using gel filtration, reverse-phase or mildly hydrophobic columns), gel filtration,
differential precipitation (for instance, "salting out" precipitations), ion-exchange
chromatography and affinity chromatography.

Because GlyT-2 is a membrane protein, which by analogy to related
transporter proteins is believed to have twelve transmembrane sequences, isolation
methods will often utilize detergents, generally non-ionic detergents, to maintain the
appropriate secondary and tertiary structure of the protein. See, for example, Lopez-
Corcuera et al., J. Biol. Chem. 266: 24809-24814, 1991. For a description of methods
for re-integrating a solubilized transporter into a membrane, see Lopez-Corcuera et al., J.
Biol. Chem. 266: 24809-24814, 1991.

The isolation of GlyT-2 can comprise isolating membranes from cells that
have been transformed to express GlyT-2. Preferably, such cells express GlyT-2 in
sufficient copy number such that the amount of GlyT-2 in a membrane fraction is at least
about 10-fold higher than that found in comparable membranes from cells that naturally
express GlyT-2, more preferably the amount is at least about 100-fold higher.

Preferably, the protein is substantially pure, meaning a purity of at least 60%
wt/wt with respect to other proteins. For the purposes of this application, GlyT-2 is
"isolated" if it has been separated from other proteins or other macromolecules of the cell
or tissue from which it is derived. Preferably, the composition containing GlyT-2 is at
least about 10-fold enriched, preferably at least about 100-fold, with respect to protein
content, over the composition of the source cells.
Expression of GlyT-2 by RNA Insertion

It will be recognized that human GlyT-2 can be expressed by the simple
method of inserting mRNA into a cell. RNA for these uses can be prepared by sub-
cloning the nucleic acid encoding a protein with GlyT-2 activity into a vector containing
a promoter for high efficiency in vitro transcription, such as a SP6 or T7 RNA
polymerase promoter. RNA production from the vector can be conducted, for instance,
with the method described in Ausubel et al., Short Protocols in Molecular Biology, John
Wiley & Sons, New York, 1992, pp. 10-63 to 10-65. Insertion of RNA into Xenopus-
derived oocytes is described, for instance, in Liu et al., FEBS Letters 305: 110-114, 1992
and Bannon et al., J. Neurochem. 54: 706-708, 1990.

Alternatively, it will be recognized that human GlyT-2 can be expressed by
the simple method of inserting mRNA into an in vitro translation system, which can be a
membrane-containing translation system. Expression of proteins in vitro is described, for
instance, in Ausubel et al., Short Protocols in Molecular Biology, John Wiley & Sons,
New York, 1992, pp. 10-63 to 10-65. See, also, Guastella et al., Science 249: 1303-
1306, 1990 (in vitro expression of a transporter). The use of subcellular membranous
material to produce membrane proteins in vitro is described in Walter and Blobel, Meth.
Enzymol. 96: 84, 1983 (for rabbit reticulocyte translation system) and Spiess and Lodish,
Cell 44: 177, 1986 (for wheat germ translation system).

Method of Characterizing or Identifying Agent

A method for the analysis of or screening for a bioactive agent for treatment
of a disease or condition associated with a nervous system disorder or condition
comprises culturing separately first and second cells, wherein the first and second cells
are preferably of the same species, more preferably of the same strain thereof, and
comprise an exogenous nucleic acid encoding a glycine transporter as described herein.
The nervous system disorders or conditions for which the agent can be used for treatment
include, but are not limited to, (a) pain, (b) myoclonus, (c) muscle spasm, (d) muscle
hyperactivity, (e) epilepsy or (f) spasticity such as that associated with stroke, head
trauma, neuronal cell death, multiple sclerosis, spinal cord injury, dystonia, Huntington's
disease or amyotrophic lateral sclerosis. In this method, the first cell is contacted with
the bioactive agent or a prospective agent, which is preferably a compound, such as a
peptide or an organic compound, in the presence of glycine, which preferably incorporates
a radioisotope, such as 3H or 14C. The contacted first cell is then tested for enhancement
or inhibition of glycine transport into the first cell as compared to glycine transport into
the second cell that was not contacted with the compound (i.e., the control cell). Such
analysis or screening preferably includes activities of finding, learning, discovering,
determining, identifying, or ascertaining.

Alternatively, the assay can utilize a composition comprising an isolated
GlyT-2 transporter in place of cells. Preferably, such preparation of isolated transporter
will comprise membrane or lipid bilayer, preferably in vesicles, which vesicles have an
inside and an outside across which transport can be measured. See, for example, Kanner,
Biochemistry 17: 1207-1211, 1978.

A bioactive agent is an enhancer of glycine transport uptake if at the end of
the test the amount of intracellular, intravesicle or otherwise transported glycine is greater
in the agent-contacted composition than in the non-agent-contacted composition;
conversely, a bioactive agent is an inhibitor of glycine transport if the amount of
intracellular or intravesicle glycine is greater in the non-agent-contacted composition as
compared to the other. Preferably, the difference in glycine uptake between a tested first
composition and a control second composition is at least about two-fold; more preferably,
the difference is at least about five-fold; most preferably, the difference is at least about
ten-fold or greater.
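
The comparison described above reduces to a ratio of glycine uptake in the agent-contacted and control compositions. A minimal sketch follows; the function name, the units and the example readings are hypothetical, and the default threshold reflects the "at least about two-fold" preference.

```python
def classify_agent(uptake_with_agent, uptake_control, min_fold=2.0):
    """Classify an agent from glycine uptake measured with and without it.

    Returns (label, fold_difference, meets_threshold). Uptake values may be in
    any consistent unit (for example, counts per minute) and must be positive.
    """
    if uptake_with_agent <= 0 or uptake_control <= 0:
        raise ValueError("uptake values must be positive")
    if uptake_with_agent > uptake_control:
        label, fold = "enhancer", uptake_with_agent / uptake_control
    elif uptake_with_agent < uptake_control:
        label, fold = "inhibitor", uptake_control / uptake_with_agent
    else:
        label, fold = "no effect", 1.0
    return label, fold, fold >= min_fold


if __name__ == "__main__":
    # Hypothetical readings: control cells take up five times more glycine.
    print(classify_agent(100.0, 500.0))
```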

A bioactive agent that is an inhibitor or an enhancer with respect to the
GlyT-2 transporter may have a neutral or opposite effect with another glycine transporter,
such as one of the GlyT-1 transporters. Preferred bioactive agents have specificity to
enhance or inhibit the GlyT-2 transporter and have neutral or negligible effect on other
glycine transporters. Preferably, a bioactive agent has at least an order of magnitude
greater potency, reflected in a concentration dependent parameter such as the IC50 value,
in inhibiting or activating glycine uptake mediated by the GlyT-2 transporter as compared
to its effect on the second glycine transporter. More preferred agents have greater
potencies of at least about 100-fold for one of the glycine transporters as compared to the
other.
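
Selectivity of this kind can be expressed as a ratio of IC50 values, a lower IC50 indicating greater potency. The sketch below is illustrative only; the function names, the tier labels and the example values are hypothetical.

```python
def selectivity_fold(ic50_glyt2, ic50_other):
    """Fold selectivity for GlyT-2: IC50 at the other transporter divided by IC50 at GlyT-2.

    Both values must be positive and expressed in the same concentration unit.
    """
    if ic50_glyt2 <= 0 or ic50_other <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_other / ic50_glyt2


def selectivity_tier(fold):
    """Map a fold-selectivity value onto the preference levels stated in the text."""
    if fold >= 100:
        return "more preferred (at least about 100-fold)"
    if fold >= 10:
        return "preferred (at least an order of magnitude)"
    return "below the preferred range"


if __name__ == "__main__":
    # Hypothetical IC50 values in nM: 20 at GlyT-2 and 4000 at a GlyT-1 transporter.
    fold = selectivity_fold(20, 4000)
    print(fold, selectivity_tier(fold))
```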

The bioactive agent can be any compound, material, composition, mixture, or
chemical that can be presented to a glycine transporter in a form that allows for the
agent to diffuse so as to contact the transporter. Such bioactive agents include but are
not limited to polypeptides preferably of two up to about 25 amino acids in length, more
preferably from two to about ten, yet more preferably from two to about five amino acids
in length. Other suitable bioactive agents in the context of the present invention include
small organic compounds, preferably of molecular weight between about 100 daltons and
about 5,000 daltons, which are composed of such functionalities as alkyl, aryl, alkene,
alkyne, halo, cyano and other groups, including heteroatoms or not. Such organic
compounds can be carbohydrates, including simple sugars, amino or imino acids, nucleic
acids, steroids, and others. The chemicals tested as prospective agents can be prepared
using combinatorial chemical processes known in the art or conventional means for
chemical synthesis. Preferably, bioactive agents are useful as drugs for treatment of
nervous system disorders or conditions.

Some compounds that inhibit GlyT-1 or GlyT-2 mediated transport also bind
to the glycine binding site on the strychnine-sensitive receptor, or to the glycine binding
site on the NMDA receptor. Such binding to the strychnine-sensitive receptor can be
identified by a binding assay whereby, for example, radiolabeled strychnine is placed in
contact with a preparation of strychnine-sensitive receptors, such as can be prepared from
a membrane fraction from spinal cord or brain stem tissue. A membrane fraction can be
prepared using conventional means, including, for example, methods of homogenization
and centrifugation.

Such binding to the NMDA receptor can be identified by a binding assay
whereby, for example, radiolabeled glycine is placed in contact with a preparation of
NMDA receptors, such as can be prepared from a membrane fraction from neuronal cells
or brain tissue. Grimwood et al., Molec. Pharmacol. 41: 923-930, 1992. The NMDA
receptors located in such membranes are treated using mild detergent, such as about 0.1%
to about 0.5% saponin, to remove any endogenous glycine or glutamate.

The ligand used in such a binding assay is radiolabeled with any detectable
isotope, such as radioactive isotopes of carbon or hydrogen. Specific binding of the
radiolabeled ligand is then determined by subtracting the radioactivity due to non-specific
binding from that which is due to total (i.e., specific and non-specific) binding of the
radiolabeled ligand. The radioactivity due to non-specific binding is determined by
measuring the amount of radiolabel associated with a strychnine-sensitive or NMDA
receptor-containing membrane fraction that has been contacted with both radiolabeled
ligand and a significant excess of non-radiolabeled ligand, such as a 100-fold excess.
The radioactivity due to total binding of the radiolabeled ligand is determined by
measuring the amount of radiolabel bound to the receptor preparation in the absence of
non-radiolabeled ligand. For the NMDA receptor, one can also measure binding to the
glycine site on the receptor using labeled analogs of amino acids, such as, for example,
dichlorokynurenic acid or L-689,560. See, for example, Grimwood et al., Molecular
Pharmacol. 41: 923-930, 1992.
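
The subtraction described above can be written out as a short calculation. The function name and the example counts are hypothetical; any consistent unit of radioactivity can be used.

```python
def specific_binding(total_counts, nonspecific_counts):
    """Specific binding = total binding minus non-specific binding.

    total_counts: radiolabel bound in the absence of non-radiolabeled ligand.
    nonspecific_counts: radiolabel bound in the presence of excess non-radiolabeled ligand.
    """
    if total_counts < 0 or nonspecific_counts < 0:
        raise ValueError("counts must be non-negative")
    return total_counts - nonspecific_counts


if __name__ == "__main__":
    # Hypothetical readings in counts per minute.
    print(specific_binding(12000, 1500))
```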

Functional ion-flux assays are used to measure the effect of compounds
identified by the present invention in enhancing or inhibiting calcium flux (for NMDA
receptor preparations) or chloride flux (for strychnine-sensitive receptor preparations).
This test is performed on cell cultures that have membrane-bound NMDA receptors or
strychnine-sensitive receptors and glycine transporters. Such cells include neuronal cells
generally, including those of the brain stem and spinal cord, and cell lines derived
therefrom, and any other cell that has been induced or transfected to express NMDA
receptors or strychnine-sensitive receptors. Calcium used in such a test is commonly the
45Ca isotope, although other calcium measuring techniques can be used as well, such as
calcium-associated fluorescence, which can be fluorescence associated with a calcium
chelator, and the like. Chloride used in such a test usually includes the isotope 36Cl. By
whatever method the calcium or chloride is monitored, ion flux can be enhanced or
inhibited as a result of the discrete addition of a bioactive agent of the present invention.
An advantage of this system is that it allows one to monitor the net effect on NMDA
receptor or strychnine-sensitive receptor function of a compound that interacts with both
the glycine site on a receptor and on a glycine transporter.

GlyT-2 inhibitors that are also strychnine-sensitive receptor agonists act in
the above-described indications by increasing glycine concentrations at the strychnine-
sensitive receptor-expressing synapses via inhibition of the glycine transporter, and via
directly enhancing strychnine-sensitive receptor activity. Glycine transporter inhibitors
that are also strychnine-sensitive receptor antagonists can nonetheless retain activity in
treating these indications, for example if the increase in glycine due to glycine transport
inhibition prevails over the strychnine-sensitive receptor antagonism. Where the
strychnine-sensitive receptor antagonist activity prevails over the effect of increased
extracellular glycine resulting from inhibition of the glycine transporter, these compounds
are useful in treating conditions associated with decreased muscle activity such as
myasthenia gravis.

As discussed above, the bioactive agents of the invention can have a number
of pharmacological actions. The relative effectiveness of the compounds can be assessed
in a number of ways, including the following:

1. Comparing the activity mediated through GlyT-1 and GlyT-2
transporters. This testing identifies bioactive agents (a) that are more active against
GlyT-1 transporters and thus more useful in treating or preventing schizophrenia,
increasing cognition and enhancing memory or (b) that are more active against GlyT-2
transporters and thus more useful in treating or preventing epilepsy, pain or spasticity.

2. Testing for strychnine-sensitive receptor or NMDA receptor binding.
This test establishes whether there is sufficient binding at this site to warrant further
examination of the pharmacological effect of such binding.

3. Testing the activity of the compounds in enhancing or diminishing ion
fluxes in primary tissue culture, for example chloride ion fluxes mediated by strychnine-
sensitive receptors or calcium ion fluxes mediated by NMDA receptors. A bioactive
agent that increases ion flux either (a) has little or no antagonist activity at the
strychnine-sensitive receptor and should not affect the potentiation of glycine activity
through GlyT-2 transporter inhibition or (b), if marked increases are observed over results
with comparative GlyT-2 inhibitors that have little direct interaction with strychnine-
sensitive receptors, then the agent is a receptor agonist.

In some cases, the agent analysis method of the invention will be used to
characterize whether a bioactive agent is useful in treating an indication in which NMDA
receptors and GlyT-1 transporters are implicated. In this case, generally, a lower measure
of activity with respect to strychnine-sensitive receptors and GlyT-2 transporters is more
desirable.

Antisense Therapies 

One aspect of the present invention is directed to the use of "antisense"
nucleic acid to treat neurological indications such as those identified above. The
approach involves the use of an antisense molecule designed to bind mRNA coding for a
GlyT-2, thereby stopping or inhibiting the translation of the mRNA, or to bind to the
GlyT-2 gene to interfere with its transcription. For discussion of the design of nucleotide
sequences that bind genomic DNA to interfere with transcription, see Helene,
Anti-Cancer Drug Design 6, 569, 1991. Once the sequence of the mRNA sought to be
bound is known, an antisense molecule can be designed that binds the sense strand by the
Watson-Crick base-pairing rules, forming a duplex structure analogous to the DNA
double helix. See Gene Regulation: Biology of Antisense RNA and DNA, Erickson and Izant,
eds., Raven Press, New York, 1991; Helene, Anti-Cancer Drug Design, 6:569 (1991);
Crooke, Anti-Cancer Drug Design 6, 609, 1991.
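
The base-pairing rule described above can be illustrated with a minimal Python sketch. It assumes the target is written as a DNA-alphabet sense sequence (an mRNA would carry U in place of T), and the function name is hypothetical. The example target is the first 20 nucleotides of SEQ ID NO:3; choosing an effective antisense site in practice involves considerations this sketch does not address.

```python
# Watson-Crick complements for the DNA alphabet.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def antisense_oligo(sense_5to3):
    """Return the strand complementary to a sense segment, written 5'->3'.

    The sense segment is read 5'->3'; its complement pairs antiparallel,
    so the sequence is reversed before each base is complemented.
    """
    return "".join(COMPLEMENT[base] for base in reversed(sense_5to3.upper()))


# First 20 nucleotides of SEQ ID NO:3 (sense strand, starting at the ATG).
target = "ATGGACTGCAGTGCTCCCAA"
print(antisense_oligo(target))  # TTGGGAGCACTGCAGTCCAT
```

Applying the function twice returns the original segment, which is a quick sanity check on any oligonucleotide designed this way.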

A serious barrier to fully exploiting antisense technology is the problem of
efficiently introducing into cells a sufficient number of antisense molecules to effectively
interfere with the translation of the targeted mRNA or the function of DNA. One method
that has been employed to overcome this problem is to covalently modify the 5' or the 3'
end of the antisense polynucleic acid molecule with hydrophobic substituents. These
modified nucleic acids generally gain access to the cell's interior with greater efficiency.
See, for example, Boutorin et al., FEBS Lett. 23,1382-1390, 1989; Shea et al., Nucleic
Acids Res. 18, 3777-3783, 1990. Additionally, the phosphate backbone of the antisense
molecules has been modified to remove or diminish negative charge (see, for example,
Agris et al., Biochemistry 25, 6268, 1986; Cazenave and Helene in Antisense Nucleic
Acids and Proteins: Fundamentals and Applications, Mol and Van der Krol, eds., p. 47
et seq., Marcel Dekker, New York, 1991) or the purine or pyrimidine bases have been
modified (see, for example, Antisense Nucleic Acids and Proteins: Fundamentals and
Applications, Mol and Van der Krol, eds., p. 47 et seq., Marcel Dekker, New York,
1991; Milligan et al. in Gene Therapy For Neoplastic Diseases, Huber and Lazo, eds., p.
228 et seq., New York Academy of Sciences, New York, 1994). Other methods to
overcome the cell penetration barrier include incorporating antisense polynucleic acid
sequences into expression vectors that can be inserted into the cell in low copy number,
but which in the cell can direct the cellular machinery to synthesize more substantial
amounts of antisense polynucleic molecules. See, for example, Farhood et al., Ann. N.Y.
Acad. Sci. 716, 23, 1994. This strategy includes the use of recombinant viruses that have
an expression site into which the antisense sequence has been incorporated. See, e.g.,
Boris-Lawrie and Temin, Ann. N.Y. Acad. Sci., 716:59 (1994). Others have tried to
increase membrane permeability by neutralizing the negative charges on antisense
molecules or other nucleic acid molecules with polycations. See, e.g., Wu and Wu,
Biochemistry, 27:887-892, 1988; Behr et al., Proc. Natl. Acad. Sci. U.S.A. 86:6982-6986,
1989.

For gene therapy such as antisense therapy, medical workers often try to
incorporate, into one or more cell types of an organism, a DNA vector capable of
directing the synthesis of a protein missing from the cell or useful to the cell or organism
when expressed in greater amounts. The methods for introducing DNA to cause a cell to
produce a new protein or a greater amount of a protein are called "transfection" methods.
See, generally, Neoplastic Diseases, Huber and Lazo, eds., New York Academy of
Science, New York, 1994; Felgner, Adv. Drug Deliv. Rev., 5:163 (1990); McLachlin et
al., Progr. Nucl. Acids Res. Mol. Biol., 38:91 (1990); Karlsson, S., Blood, 78:2481 (1991);
Einerhand and Valerio, Curr. Top. Microbiol. Immunol., 177:217-235 (1992); Makdisi et
al., Prog. Liver Dis., 10:1 (1992); Litzinger and Huang, Biochim. Biophys. Acta,
1113:201 (1992); Morsy et al., J.A.M.A., 270:2338 (1993); Dorudi et al., British J.
Surgery, 80:566 (1993).

Other general methods of incorporating nucleic acids into cells include
calcium phosphate precipitation of nucleic acid and incubation with the target cells
(Graham and Van der Eb, Virology, 52:456, 1983), co-incubation of nucleic acid,
DEAE-dextran and cells (Sompayrac and Danna, Proc. Natl. Acad. Sci., 78:7575, 1981),
electroporation of cells in the presence of nucleic acid (Potter et al., Proc. Natl. Acad.
Sci., 81:7161-7165, 1984), incorporating nucleic acid into virus coats to create
transfection vehicles (Gitman et al., Proc. Natl. Acad. Sci. U.S.A., 82:7309-7313, 1985)
and incubating cells with nucleic acid incorporated into liposomes (Wang and Huang,
Proc. Natl. Acad. Sci., 84:7851-7855, 1987). One approach to gene therapy is to
incorporate the gene sought to be introduced into the cell into a virus, such as a herpes
virus, adenovirus, parvovirus or a retrovirus. See, for instance, Akli et al., Nature
Genetics 3, 224, 1993.

The nucleic acid compositions of the invention can be, for example,
administered orally, topically, rectally, nasally, vaginally, by inhalation, for example by
use of an aerosol, or parenterally, e.g. intramuscularly, subcutaneously, intraperitoneally,
intraventricularly, or intravenously. The nucleic acid compositions can be administered
alone, or they can be combined with a pharmaceutically-acceptable carrier or excipient
according to standard pharmaceutical practice. For the oral mode of administration, the
nucleic acid compositions can be used in the form of tablets, capsules, lozenges, troches,
powders, syrups, elixirs, aqueous solutions and suspensions, and the like. In the case of
tablets, carriers that can be used include lactose, sodium citrate and salts of phosphoric
acid. Various disintegrants such as starch, and lubricating agents such as magnesium
stearate, sodium lauryl sulfate and talc, are commonly used in tablets. For oral
administration in capsule form, useful diluents are lactose and high molecular weight
polyethylene glycols. When aqueous suspensions are required for oral use, the nucleic
acid compositions can be combined with emulsifying and suspending agents. If desired,
certain sweetening and/or flavoring agents can be added. For parenteral administration,
sterile solutions of the conjugate are usually prepared, and the pH of the solutions is
suitably adjusted and buffered. For intravenous use, the total concentration of solutes
should be controlled to render the preparation isotonic. For ocular administration,
ointments or droppable liquids may be delivered by ocular delivery systems known to the
art such as applicators or eye droppers. Such compositions can include mucomimetics
(such as hyaluronic acid, chondroitin sulfate, hydroxypropyl methylcellulose or polyvinyl
alcohol), preservatives such as sorbic acid, EDTA or benzalkonium chloride, and the
usual quantities of diluents and/or carriers. For pulmonary administration, diluents and/or
carriers will be selected to be appropriate to allow the formation of an aerosol.

Generally, the nucleic acid compositions will be administered in an effective
amount. For pharmaceutical uses, an effective amount is an amount effective to either
(1) reduce the symptoms of the indication sought to be treated or (2) induce a
pharmacological change relevant to treating or preventing the indication sought to be
treated.

For viral gene therapy vectors, dosages will generally be from about 1 µg to
about 1 mg of nucleic acid per kg of body mass. For non-infective gene therapy vectors,
dosages will generally be from about 1 µg to about 100 mg of nucleic acid per kg of
body mass. Antisense oligonucleotide dosages will generally be from about 1 µg to about
100 mg of nucleic acid per kg of body mass.

Autoimmune Disorders

Autoimmune disorders whereby antibodies are produced against glycine
transporters can be expected to be associated with disease states. For example, for the
GlyT-2 transporters, such disorders can be expected to be associated with decreased
muscle activity, for instance decreased muscle activity that clinically presents much like
myasthenia gravis, or to be associated with decreased pain perception. See, for an
example of a disease caused by autoantibodies to a molecule involved in
neurotransmission (glutamic acid decarboxylase), Nathan et al., J. Neurosci. Res. 40:
134-137, 1995.

The presence of these antibodies can be measured by established
immunological methods using protein sequences obtained from the nucleic acids
described herein or the related glycine transporters reported elsewhere. See, for example,
Kim et al., Mol. Pharmacol. 45: 608-617, 1994 and Liu et al., J. Biol. Chem. 268:
22802-22808, 1992. Such immunological methods are described, for example, in Ausubel
et al., Short Protocols In Molecular Biology, John Wiley & Sons, New York, 1992.

The following examples further illustrate the present invention, but of course,
should not be construed as in any way limiting its scope.

Example 1A - GlyT-2 Cloning

The cDNA encoding human GlyT-2 was generated by Reverse-Transcription
PCR (RT-PCR) in two steps. In the first step, a degenerate primer corresponding to the
rat GlyT-2 nucleotide sequence from 2540 to 2521 (5'-GGRTCDATCATRTTYTTRTA)
was used to prime cDNA synthesis from human spinal cord poly A mRNA (Clontech,
Palo Alto, CA). The numbering recited herein for the rat sequence is according to the
numbering reported in Liu et al., J. Biol. Chem. 268: 22802-22808, 1992. The following
primer pairs were then used in PCR reactions.

Primer A1: 5'-CCNAARGARATGAAYAARCCNCC
    (SEQ ID NO:37; based on NT 223-245 of rat sequence)
Primer A2: 5'-GCNGTGAAGTACACCACTTTNCC
    (SEQ ID NO:38; based on NT 1490-1468 of rat sequence)
Primer B1: 5'-CCNAARGARATGAAYAARCCNCC
    (SEQ ID NO:39; based on NT 223-245 of rat sequence; same
    primer as Primer A1)
Primer B2: 5'-GGCYTCNGGGTAARCCACRAANGC
    (SEQ ID NO:40; based on NT 1872-1849 of rat sequence)
The designation "R" indicates that the oligonucleotide composition has a mixture of
adenosine and guanosine at the indicated position; "N" is for mixed oligonucleotides with
all four base combinations at the indicated position; "Y" is for mixtures of cytosine and
thymidine; "K" is for mixtures of guanosine and thymidine; "D" is for mixtures of
adenosine, guanosine and thymidine.
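
To make the degenerate-base designations concrete, the following Python sketch counts and enumerates the individual sequences contained in a degenerate primer. The function names are illustrative and are not part of the described method; only the designations defined above are mapped.

```python
from itertools import product

# Base mixtures for the designations defined above; plain bases map to themselves.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # adenosine or guanosine
    "Y": "CT",    # cytosine or thymidine
    "K": "GT",    # guanosine or thymidine
    "D": "AGT",   # adenosine, guanosine or thymidine
    "N": "ACGT",  # any of the four bases
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for code in primer:
        count *= len(CODES[code])
    return count


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(CODES[c] for c in primer))]


rt_primer = "GGRTCDATCATRTTYTTRTA"   # primer used for first-strand cDNA synthesis
primer_a1 = "CCNAARGARATGAAYAARCCNCC"
print(degeneracy(rt_primer))  # 48
print(degeneracy(primer_a1))  # 256
```

Enumeration with `expand` is practical only for modest degeneracy, since the count grows multiplicatively with each mixed position.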
The fragments generated by the A1 + A2 primers and by the B1 + B2
primers were separately cloned into pCRscript (Stratagene, La Jolla, CA) or pCR2.1
(Invitrogen, San Diego, CA), and sequenced from the resulting clones using the
AutoRead sequencing kit (Pharmacia, Piscataway, NJ). Comparison of these sequences
to rat GlyT-2 using the Lipman-Pearson FASTA algorithm revealed an 89% identity,
confirming that these sequences encoded human GlyT-2. The A1 + A2 primer pair
produced clone phG2-1, which has the nucleic acid sequence of SEQ ID NO:5 as its
insert. The B1 + B2 primer pair produced clone phG2-2, which has the nucleic acid
sequence of SEQ ID NO:7 as its insert.

For the second step, cDNA was synthesized from human spinal cord or
cerebellum mRNA (Clontech, Palo Alto, CA) using random hexamers (Promega,
Madison, WI), and additional primers were designed based upon the sequence of clones
phG2-1 and phG2-2 for PCR. The following primer pairs were used to amplify the 5'
and 3' ends of the human GlyT-2 cDNA.
Primer C1: 5'-CGGTTCAATCTGTTGTCCGCATCAGACATG
    (SEQ ID NO:41; based on NT 181-210 of rat sequence)
Primer C2: 5'-GCAGGCTCGCGCGTCCGCTG
    (SEQ ID NO:42; based on NT 210-191 of human sequence)
Primer D1: 5'-CCCGTATGTCGTACTCGTGATCCTCCTCATCCG
    (SEQ ID NO:43; based on NT 1284-1316 of human sequence)
Primer D2: 5'-CCNCCRTGNGTDATCATNGGRAANCCC
    (SEQ ID NO:44; based on NT 2087-2061 of rat sequence)
Primer E1: 5'-CCCGTATGTCGTACTCGTGATCCTCCTCATCCG
    (SEQ ID NO:43; based on NT 1284-1316 of human sequence)
Primer E2: 5'-CCATCCACACTACTGGAYYARCAYTGNGTNCC
    (SEQ ID NO:45; based on NT 2624-2593 of rat sequence)
Primer F1: 5'-CAGATTTCCTTCTCTTTATCTGCTGCATGG
    (SEQ ID NO:46; based on NT 1417-1446 of human sequence)
Primer F2: 5'-GGRTCDATCATRTTYTTRTANCKYTCNCC
    (SEQ ID NO:47; based on NT 2540-2512 of rat sequence)
Primer G1: 5'-CCTGCACCAACAGTGCCACAAGC
    (SEQ ID NO:48; based on NT 1517-1539 of human sequence)
Primer G2: 5'-CCATCCACACTACTGGAYYARCAYTGNGTNCC
    (SEQ ID NO:45; based on NT 2624-2593 of rat sequence)
Primer H1: 5'-CCAAGTACCTACGCACACACAAGCC
    (SEQ ID NO:49; based on NT 1784-1808 of human sequence)
Primer H2: 5'-GGATTAATACGGGACCATCCACACTACT
    (SEQ ID NO:50; based on NT 2638-2611 of rat sequence)
The C1 + C2 primer pair produced clones phG2-3-a and phG2-3-b, which
have the nucleic acid sequences of SEQ ID NOS:1 and 3 as their inserts, respectively. The
D1 + D2 primer pair produced phG2-4-a and phG2-4-b, which have the nucleic acid
sequences of SEQ ID NOS:10 and 12 as their inserts, respectively. The E1 + E2 primer pair
produced a clone which is believed to encompass nucleotides 1317-2379. The F1 + F2
primer pair produced a clone which is believed to encompass nucleotides 1447-2298.
The G1 + G2 primer pair produced clone phG2-7-a, which has the nucleic acid sequence
of SEQ ID NO:14 as its insert, and clone phG2-7-b, which has the nucleic acid sequence
of SEQ ID NO:16 as its insert. The H1 + H2 primer pair produced phG2-8-a and
phG2-8-b, which have the nucleic acid sequences of SEQ ID NOS:22 and 24 as their inserts,
respectively.

The PCR fragments were cloned into pCR2.1 (Invitrogen). Figure 1 shows
the location of each of the cloned cDNAs in relation to the entire human GlyT-2
sequence. Clones phG2-3 and phG2-8-b were obtained from human cerebellum mRNA
while the rest were from spinal cord. The cDNA inserts were sequenced using the
AutoRead sequencing kit (Pharmacia) and the ALFexpress™ automatic sequencing
apparatus (Pharmacia). These sequences implied ten point variations in the amino acid
sequence. Comparison of the human GlyT-2 DNA sequence of SEQ ID NO:18 to the rat
GlyT-2 sequence revealed an 89% nucleic acid identity and a 94.4% amino acid identity
using the FASTA algorithm.
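
As a rough illustration of what a percent-identity figure measures, the sketch below computes identity over two sequences that have already been aligned, with "-" marking gaps. This is a simplification and an assumption of this example: FASTA performs the alignment itself, and programs differ in how gapped columns enter the denominator. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical positions among columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns in which both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy pre-aligned fragments for illustration only, not taken from the sequence listing.
print(round(percent_identity("ATGGACTG-A", "ATGCACTGCA"), 1))  # 88.9
```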
Example 1B - Further GlyT-2 Cloning

The following primers were also employed:
Primer I1: 5'-AGCTCTGCGGGACTTGAGAG
    (SEQ ID NO:51; based on NT 276-295 of human sequence)
Primer I2: 5'-GTACACCACTTTTCCTGAAGTCTTG
    (SEQ ID NO:52; based on NT 1245-1269 of human sequence)
Primer J1: 5'-AGCTCTGCGGGACTTGAGAG
    (SEQ ID NO:51; based on NT 276-295 of human sequence)
Primer J2: 5'-CCTTGGTCTGCCACATTCTCAATGTTG
    (SEQ ID NO:53; based on NT 1599-1625 of human sequence)

The I1 + I2 primer pair produced clones phG2-9-a, phG2-9-b and phG2-9-c, which have
the nucleic acid sequences of SEQ ID NOS:29, 31 and 33 as their inserts, respectively.
The J1 + J2 primer pair produced clone phG2-10, which has the nucleic acid sequence of
SEQ ID NO:35.

Example 2 - Full-length Clone 

The human GlyT-2 cDNAs were then used to construct a full length human
GlyT-2 coding sequence, which was cloned into the pcDNA3 vector (Invitrogen). The
clone incorporated the nucleic acid sequence of SEQ ID NO:20 and was denoted
pHGT2-a. The 5' end of the cDNA was constructed by inserting the 254 bp
Hind III-Nar I fragment from clone phG2-3 into clone phG2-1, previously digested with Hind III
and Nar I. The 3' end of the cDNA was constructed by inserting the Hind III-Hinc II
fragment from phG2-2 and the Hinc II-Xba I fragment from clone phG2-7 into the
pcDNA3 vector previously digested with Hind III and Xba I. Lastly, the Hind III-Nru I
fragment from the 5' end clone and the Nru I-Xba I fragment from the 3' end clone were
cloned into the pcDNA3 vector (Invitrogen) digested with Hind III and Xba I. The
pHGT2-a expression clone thus obtained contains the sequence of human GlyT-2 from 1
to 2397 under the control of the human cytomegalovirus (CMV) promoter. In this
expression clone, nts 1-173 were derived from clone phG2-3; nts 174-823 were derived
from clone phG2-1; nts 824-1599 were derived from clone phG2-2; and nts 1600-2397
were derived from clone phG2-7 (see Fig. 2).
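
The nucleotide ranges recited for pHGT2-a can be checked for gaps and overlaps with a short Python sketch. The helper name is hypothetical; the ranges are those stated above, read as 1-based and inclusive.

```python
def tiles_exactly(segments, start, end):
    """Return True if the (first, last) ranges cover start..end with no gap or overlap."""
    expected = start
    for first, last in sorted(segments):
        # Each segment must begin exactly where the previous one ended.
        if first != expected or last < first:
            return False
        expected = last + 1
    return expected == end + 1


# Source clone -> nucleotide range of the pHGT2-a coding sequence, as recited in Example 2.
PHGT2_A = {
    "phG2-3": (1, 173),
    "phG2-1": (174, 823),
    "phG2-2": (824, 1599),
    "phG2-7": (1600, 2397),
}

print(tiles_exactly(PHGT2_A.values(), 1, 2397))  # True
```

The same check can be applied to any other construct whose segment boundaries are listed in this way.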
Example 3A - Second Full-Length Clone 

An expression clone containing the nucleic acid sequence of SEQ ID NO:18
is constructed from the expression clone containing SEQ ID NO:20 by site-directed
mutagenesis to change NT 304 from G to A, NT 371 from T to C, NT 836 from A to T,
NT 1116 from G to A, NT 1831 from G to A, NT 2382 from T to C, NT 2388 from A
to G, NT 2391 from T to C and NT 2394 from A to G. The mutagenesis is conducted
by the oligonucleotide-directed methodology described by Ausubel et al., Current
Protocols in Molecular Biology, John Wiley and Sons, New York, 1995, pp. 8.1.1-8.1.6.
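
The nine nucleotide changes listed above can be represented and applied in silico as in the following Python sketch. The function name is hypothetical, and the demonstration uses a short toy sequence because the full sequences of SEQ ID NOS:18 and 20 are not reproduced here. Checking the original base before each substitution guards against numbering errors.

```python
# (1-based position, original base, new base), as recited in Example 3A.
MUTATIONS = [
    (304, "G", "A"), (371, "T", "C"), (836, "A", "T"),
    (1116, "G", "A"), (1831, "G", "A"), (2382, "T", "C"),
    (2388, "A", "G"), (2391, "T", "C"), (2394, "A", "G"),
]


def apply_point_mutations(sequence, mutations):
    """Apply point substitutions, verifying the expected original base at each site."""
    bases = list(sequence.upper())
    for position, original, new in mutations:
        found = bases[position - 1]
        if found != original:
            raise ValueError(
                "position %d: expected %s, found %s" % (position, original, found)
            )
        bases[position - 1] = new
    return "".join(bases)


# Toy example only: change position 3 from G to A in a made-up sequence.
print(apply_point_mutations("ACGTACGT", [(3, "G", "A")]))  # ACATACGT
```

With the full SEQ ID NO:20 sequence as input, passing `MUTATIONS` would be expected to yield the SEQ ID NO:18 sequence, provided the position numbering matches that used in the text.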
Example 3B - Third Full-Length Clone 

The human GlyT-2 cDNAs were used to construct another full-length GlyT-2
coding sequence, which was cloned into the pcDNA3 vector (Invitrogen). The clone,
denoted pHGT2-b, incorporated the nucleic acid sequence of SEQ ID NO:28 and encoded
SEQ ID NO:27. First, a 254 bp HindIII-NarI fragment from phG2-3a (SEQ ID NO:1)
was inserted into clone phG2-2 (SEQ ID NO:7) which had previously been digested with
HindIII-NarI, creating Intermediate 1. A 1.6 kb HindIII-HincII fragment from
Intermediate 1 and an 800 bp HincII-XbaI fragment from clone phG2-7b were ligated
into pcDNA3 that had been digested with HindIII-XbaI, creating Intermediate 2.

A NdeI-MscI fragment (1 kb) and a BsmI-NdeI fragment (6.9 kb, containing
pcDNA3) from Intermediate 2 were ligated with a 434 bp MscI-BsmI fragment from
phG2-1 (SEQ ID NO:5), creating Intermediate 3. A 3.8 kb BssHII fragment from
Intermediate 3 was ligated with a 4.0 kb BssHII fragment of clone pHGT2-a (see
Example 2), creating pHGT2-b. In pHGT2-b, nts 1-173 were derived from clone
phG2-3a (SEQ ID NO:1), nts 174-523 and 962-1599 from clone phG2-2 (SEQ ID NO:7), nts
524-961 from clone phG2-1 (SEQ ID NO:5), and nts 1600-2397 from clone phG2-7b
(SEQ ID NO:16).

Example 4 - GlyT-2 Expression

The clones of examples 2 and 3B were transfected into QT-6 cells (from 
American Type Culture Collection, Accession No. ATCC CRL-1708) using the method 
described in Example 5. The glycine transport assay described in Example 6 was used to 
confirm that glycine transport activity was conferred to the cells by the transfection. 
Example 5 - Transfection 

This example sets forth methods and materials used for growing and 
transfecting QT-6 cells, which are avian fibroblasts derived from quail. Transfections
with pHGT2-a have been conducted, as have transfections with GlyT-1 vectors, though
these latter transfections were conducted at separate times.

QT-6 cells were obtained from American Type Culture Collection (Accession
No. ATCC CRL-1708). Complete QT-6 medium for growing QT-6 was Medium 199
(Sigma Chemical Company, St. Louis, MO; hereinafter "Sigma") supplemented to be
10% tryptose phosphate; 5% fetal bovine serum (Sigma); 1% penicillin-streptomycin
(Sigma); and 1% sterile dimethylsulfoxide (DMSO; Sigma). Other solutions required for
growing or transfecting QT-6 cells included:

DNA/DEAE Mix: 450 µl TBS, 450 µl DEAE dextran (Sigma), and 100 µl
of DNA (4 µg) in TE, where the DNA included GlyT-1a, GlyT-1b, GlyT-1c, or GlyT-2
encoding DNA, in a suitable expression vector. The DNA used was as defined below.

PBS: Standard phosphate buffered saline, pH 7.4, including 1 mM CaCl2 and
1 mM MgCl2, sterilized through a 0.2 µm filter.

TBS: One ml of Solution B, 10 ml of Solution A; brought to 100 ml with
distilled H2O; filter-sterilized and stored at 4°C.

TE: 0.01 M Tris, 0.001 M EDTA, pH 8.0.

DEAE dextran: Sigma, #D-9885. A stock solution was prepared consisting
of 0.1% (1 mg/ml) of the DEAE dextran in TBS. The stock solution was filter-sterilized
and frozen in 1 ml aliquots.

Chloroquine: Sigma, #C-6628. A stock solution was prepared consisting of
100 mM chloroquine in H2O. The stock solution was filter-sterilized and stored in 0.5 ml
aliquots, frozen.

Solution A (10X):

    NaCl         8.00 g
    KCl          0.38 g
    Na2HPO4      0.20 g
    Tris base    3.00 g

The solution was adjusted to pH 7.5 with HCl, brought to 100.0 ml with distilled H2O,
and filter-sterilized and stored at room temperature.

Solution B (100X):

    CaCl2·2H2O   1.5 g
    MgCl2·6H2O   1.0 g

The solution was brought to 100 ml with distilled H2O, and filter-sterilized; the solution
was then stored at room temperature.

HBSS: 150 mM NaCl, 20 mM HEPES, 1 mM CaCl2, 10 mM glucose, 5
mM KCl, 1 mM MgCl2·H2O; adjusted with NaOH to pH 7.4.
Standard growth and passaging procedures used were as follows: Cells were
grown in 225 ml flasks. For passaging, cells were washed twice with warm HBSS (5 ml
each wash). Two ml of a 0.05% trypsin/EDTA solution was added, the culture was
swirled, then the trypsin/EDTA solution was aspirated quickly. The culture was then
incubated about 2 minutes (until cells lifted off), then 10 ml of QT-6 medium was added and
the cells were further dislodged by swirling the flask and tapping its bottom. The cells
were removed and transferred to a 15 ml conical tube, centrifuged at 1000 x g for 10
minutes, and resuspended in 10 ml of QT-6 medium. A sample was removed for
counting, the cells were then diluted further to a concentration of 1 x 10^5 cells/ml using
QT-6 medium, and 65 ml of the culture was added per 225 ml flask of passaged cells.
Transfection was accomplished using cDNAs prepared as follows:

For human GlyT-2 expression, the pHGT2-a clone described above was used.

The human GlyT-1a (hGlyT-1a) clone contained the sequence of hGlyT-1a
from nucleotide position 183 to 2108 cloned into the pRc/CMV vector (Invitrogen, San
Diego, CA) as a Hind III-Xba I fragment as described in Kim et al., Mol. Pharmacol.
45: 608-617, 1994. The first 17 nucleotides (corresponding to the first 6 amino acids) of
the GlyT-1a sequence reported in this Kim et al. article are actually based on the rat
sequence. To determine whether the sequence of human GlyT-1a is different in this
region, the 5' region of hGlyT-1a from nucleotide 1 to 212 was obtained by rapid
amplification of cDNA ends using the 5' RACE system supplied by Gibco BRL
(Gaithersburg, MD). Sequencing of this 5' region of GlyT-1a confirmed that the first 17
nucleotides of coding sequence are identical in human and rat GlyT-1a.

The human GlyT-1b (hGlyT-1b) clone contained the sequence of hGlyT-1b
from nucleotide position 213 to 2274 cloned into the pRc/CMV vector as a Hind III -
Xba I fragment as described in Kim et al., supra.

The human GlyT-1c (hGlyT-1c) clone contained the sequence of hGlyT-1c
from nucleotide position 213 to 2336 cloned into the pRc/CMV vector (Invitrogen) as a
Hind III - Xba I fragment as described in Kim et al., supra. The Hind III - Xba I
fragment of hGlyT-1c from this clone was subcloned into the pRc/RSV vector.
Transfection experiments were performed with GlyT-1c in both the pRc/RSV and
pRc/CMV expression vectors.

The following four day procedure for the transfections was used:

On day 1, QT-6 cells were plated at a density of 1 x 10^6 cells in 10 ml of
complete QT-6 medium in 100 mm dishes.

On day 2, the medium was aspirated and the cells were washed with 10 ml
of PBS followed by 10 ml of TBS. The TBS was aspirated, then 1 ml of the
DEAE/DNA mix was added to the plate. The plate was swirled in the hood every 5
minutes. After 30 minutes, 8 ml of 80 µM chloroquine in QT-6 medium was added and
the culture was incubated for 2.5 hours at 37°C and 5% CO2. The medium was then
aspirated and the cells were washed two times with complete QT-6 medium, then 100 ml
complete QT-6 medium was added and the cells were returned to the incubator.

On day 3, the cells were removed with trypsin/EDTA as described above,
and plated into the wells of 96-well assay plates at approximately 2 x 10^5 cells/well.

On day 4, glycine transport was assayed as described in Example 6.
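
The working chloroquine solution used on day 2 is prepared from the 100 mM stock described above. A minimal Python sketch of the dilution arithmetic (C1 x V1 = C2 x V2) follows; the function name is hypothetical, and the calculation assumes the stock is diluted directly into QT-6 medium.

```python
def stock_volume_ul(stock_mM, final_uM, final_volume_ml):
    """Volume of stock (in µl) needed for a given final concentration and volume."""
    stock_uM = stock_mM * 1000.0               # mM -> µM
    final_volume_ul = final_volume_ml * 1000.0  # ml -> µl
    # C1 * V1 = C2 * V2, solved for V1.
    return final_uM * final_volume_ul / stock_uM


# 8 ml of 80 µM chloroquine from the 100 mM stock.
print(stock_volume_ul(100.0, 80.0, 8.0))  # 6.4
```

That is, about 6.4 µl of the 100 mM stock per 8 ml of medium gives the 80 µM working solution added to each dish.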

Example 6 - Glycine Uptake

This example illustrates a method for the measurement of glycine uptake by
transfected cultured cells.

Transient GlyT-transfected cells or control cells grown in accordance with
Example 5 were washed three times with HEPES buffered saline (HBS). The control
cells were treated precisely as the GlyT-transfected cells except that the transfection
procedure omitted any cDNA. The cells were incubated 10 minutes at 37°C, after which
a solution was added containing 50 nM [3H]glycine (17.5 Ci/mmol) and either (a) no
potential competitor, (b) 10 nM nonradioactive glycine or (c) a concentration of a
prospective agent. A range of concentrations of the prospective agent was used to
generate data for calculating the concentration resulting in 50% of the effect (for
example, the IC50s, which are the concentrations of agent inhibiting glycine uptake by
50%). The cells were then incubated another 20 minutes at 37°C, after which the cells
were washed three times with ice-cold HBS. Scintillant was added to the cells, the cells
were shaken for 30 minutes, and the radioactivity in the cells was counted using a
scintillation counter. Data were compared between the cells contacted or not contacted
by a prospective agent, and, where relevant, between cells having GlyT-1 activity versus
cells having GlyT-2 activity, depending on the assay being conducted.

Expression of glycine transporter activity in QT-6 cells transfected with the
human GlyT-2 clone, pHGT2-a, is demonstrated in Figure 5, in which [3H]glycine
uptake is shown for mock and pHGT2-a transfected cells. QT-6 cells transfected with
pHGT2-a show significant increases in glycine transport as compared to mock transfected
control cells. The results are presented as means ± SEM of a representative experiment
performed in triplicate. Substantially similar results were obtained with pHGT2-b.

The concentration dependence of glycine transport in pHGT2-a-transfected
cells is shown in Figure 6. Substantially similar results were obtained with pHGT2-b.
QT-6 cells transfected with the human GlyT-2 were incubated with 50 nM [3H]glycine
and the indicated concentrations of unlabeled glycine for 20 minutes, and the
cell-incorporated radioactivity was determined by scintillation counting. Data points represent
means ± SEM from an experiment performed in quadruplicate. The results indicated an
IC50 of 40 µM.
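
One simple way to estimate an IC50 from concentration-response data of this kind is to interpolate, on a logarithmic concentration axis, between the two tested concentrations that bracket 50% of control uptake. The Python sketch below shows that approach; it is an assumption of this example rather than the fitting procedure used for Figure 6, which is not specified, and the data values are hypothetical.

```python
import math


def ic50_by_interpolation(concentrations, uptake_percent):
    """Estimate the concentration giving 50% of control uptake.

    concentrations: ascending, positive values (any consistent unit).
    uptake_percent: uptake at each concentration, as percent of control.
    Interpolation is linear in log10(concentration) between the bracketing points.
    """
    points = list(zip(concentrations, uptake_percent))
    for (c_low, u_low), (c_high, u_high) in zip(points, points[1:]):
        if u_low >= 50.0 >= u_high and u_low != u_high:
            fraction = (u_low - 50.0) / (u_low - u_high)
            log_ic50 = math.log10(c_low) + fraction * (
                math.log10(c_high) - math.log10(c_low)
            )
            return 10.0 ** log_ic50
    raise ValueError("uptake does not cross 50% of control in the tested range")


# Hypothetical values for illustration only; these are not data from Figure 6.
concentrations_uM = [1.0, 10.0, 100.0, 1000.0]
uptake = [95.0, 75.0, 25.0, 5.0]
print(round(ic50_by_interpolation(concentrations_uM, uptake), 1))  # 31.6
```

A nonlinear fit of a sigmoidal model to all data points would normally be preferred when enough concentrations have been tested; the interpolation is useful as a quick estimate.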

Example 7 - Calcium Flux 

This example illustrates a protocol for measuring calcium flux in cells. 

The calcium flux measurement was generally performed in primary cell
cultures, which were prepared using standard procedures and techniques that require
sterile dissecting equipment, a microscope and defined medium. The protocol used was
substantially as described by Lu et al., Proc. Natl. Acad. Sci. USA, 88: 6289-6292, 1991.

Example 8 - Binding to Strychnine-Sensitive Receptor

Binding of strychnine to strychnine-sensitive receptors was measured as
described in White et al., J. Neurochem. 35: 503-512, 1989 and Becker et al., J. Neurosci.
6: 1358-1364, 1986, with minor modifications.

The nucleic acid (N.A.) or amino acid sequences referred to herein by SEQ
ID NO: are as follows:

SEQUENCE LISTING

(1) GENERAL INFORMATION 
(i) APPLICANT: Albert, Vivian 

(ii) TITLE OF THE INVENTION: Human Glycine Trail 

(iii) NUMBER OF SEQUENCES: 53 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Dechert Price & Rhoads 

(B) STREET: 997 Lenox Drive, Building 3, Suit

(C) CITY: Lawrenceville 

(D) STATE: NJ 

(E) COUNTRY: USA 

(F) ZIP: 08543 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(viii) ATTORNEY /AGENT INFORMATION: 

(A) NAME: Bloom, Allen 

(B) REGISTRATION NUMBER: 29,135 

(C) REFERENCE/DOCKET NUMBER: 317743-108WO 
(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 609-520-3214 

(B) TELEFAX: 609-520-3259 

(C) TELEX: 

(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 190 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 63 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Asp Cys Ser Ala Pro Lys Glu Met Asn Lys Leu Pro Ala Asn Ser
1 5 10 15
Pro Glu Ala Ala Ala Ala Gln Gly His Pro Asp Gly Pro Cys Ala Pro
20 25 30
Arg Thr Ser Pro Glu Gln Glu Leu Pro Ala Ala Ala Ala Pro Pro Pro
35 40 45
Pro Arg Val Pro Arg Ser Ala Ser Thr Gly Ala Gln Thr Phe Gln
50 55 60

(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 190 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATGGACTGCA GTGCTCCCAA GGAAATGAAT AAACTGCCAG CCAACAGCCC GGAGGCGGCG 60 

GCGGCGCAGG GCCACCCGGA TGGCCCATGC GCTCCCAGGA CGAGCCCGGA GCAGGAGCTT 120 

CCCGCGGCTG CCGCCCCGCC GCCGCCACGT GTGCCCAGGT CCGCTTCCAC CGGCGCCCAA 180 

ACTTTCCAGT 190 

(2) INFORMATION FOR SEQ ID NO: 4: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 63 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Asp Cys Ser Ala Pro Lys Glu Met Asn Lys Leu Pro Ala Asn Ser
1 5 10 15
Pro Glu Ala Ala Ala Ala Gln Gly His Pro Asp Gly Pro Cys Ala Pro
20 25 30
Arg Thr Ser Pro Glu Gln Glu Leu Pro Ala Ala Ala Ala Pro Pro Pro
35 40 45
Pro Arg Val Pro Arg Ser Ala Ser Thr Gly Ala Gln Thr Phe Gln
50 55 60

(2) INFORMATION FOR SEQ ID NO: 5: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1216 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

AGCCAACAGC CCGGAGGCGG CGGCGGCGCA GGGCCACCCG GATGGCCCAT GCGCTCCCAG 60
GACGAGCCCG GAGCAGGAGC TTCCCGCGGC TGCCGCCCCG CCGCCGCCAC GTGTGCCCAG 120
GTCCGCTTCC ACCGGCGCCC AAACTTTCCA GTCAGCGGAC GCGCGAGCCT GCGAGGCTGA 180
GCGGCCAGGA GTGGGGTCTT GCAAACTCAG TAGCCCGCGG GCGCAGGCGG CCTCTGCAGC 240
TCTGCGGGAC TTGAGAGAGG CGCAAGGCGC GCAGGCCTCG CCCCCTCCCG GGAGCTCCGG 300
GCCCGGCAAC GCGCTGCACT GTAAGATCCC TTTTCTGCGA GGCCCGGAGG GGGATGCGAA 360
CGTGAGTGTG GGCAAGGGCA CCCTGGAGCG GAACAATACC CCTGTTGTGG GCTGGGTGAA 420
CATGAGCCAG AGCACCGTGG TGCTGGGCAC GGATGGAATC ACGTCCGTGC TCCCGGGCAG 480
CGTGGCCACC GTTGCCACCC AGGAGGACGA GCAAGGGGAT GAGAATAAGG CCCGAGGGAA 540
CTGGTCCAGC AAACTGGACT TCATCCTGTC CATGGTGGGG TACGCAGTGG GGCTGGGCAA 600
TGTCTGGAGG TTTCCCTACC TGGCCTTCCA GAACGGGGGA GGTGCTTTCC TCATCCCTTA 660
CCTGATGATG CTGGCTCTGG CTGGATTACC CATCTTCTTC TTGGAGGTGT CGCTGGGCCA 720
GTTTGCCAGC CAGGGACCAG TGTCTGTGTG GAAGGCCATC CCAGCTCTAC AAGGCTGTGG 780
CATCGCGATG CTGATCATCT CTGTCCTAAT AGCCATATAC TACAATGTGA TTATTTGCTA 840
TACACTTTTC TACCTGTTTG CCTCCTTTGT GTCTGTACTA CCCTGGGGCT CCTGCAACAA 900
CCCTTGGAAT ACGCCAGAAT GCAAAGATAA AACCAAACTT TTATTAGATT CCTGTGTTAT 960
CAGTGACCAT CCCAAAATAC AGATCAAGAA CTCGACTTTC TGCATGACCG CTTATCCCAA 1020
CGTGACAATG GTTAATTTCA CCAGCCAGGC CAATAAGACA TTTGTCAGTG GAAGTGAAGA 1080
GTACTTCAAG TACTTTGTGC TGAAGATTTC TGCAGGGATT GAATATCCTG GCGAGATCGG 1140
GTGGCCACTA GCTCTCTGCC TCTTCCTGGC TTGGGTCATT GTGTATGCAT CGTTGGCTAA 1200
AGGAATCAAG ACTTCA 1216

(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 405 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Ala Asn Ser Pro Glu Ala Ala Ala Ala Gln Gly His Pro Asp Gly Pro
1 5 10 15

Cys Ala Pro Arg Thr Ser Pro Glu Gin Glu Leu Pro Ala Ala Ala Ala 

20 25 30 

Pro Pro Pro Pro Arg Val Pro Arg Ser Ala Ser Thr Gly Ala Gin Thr 

35 40 45 

Phe Gin Ser Ala Asp Ala Arg Ala Cys Glu Ala Glu Arg Pro Gly Val 

55 60 
Gly Ser Cys Lys Leu Ser Ser Pro Arg Ala Gin Ala Ala Ser Ala Ala 

70 75 bo 

Leu Arg Asp Leu Arg Glu Ala Gin Gly Ala Gin Ala Ser Pro Pro Pro 

85 90 9 c 

Gly Ser Ser Gly Pro Gly Asn Ala Leu His Cys Lys lie Pro Phe Leu 

0 105 110 

Arg Gly Pro Glu Gly Asp Ala Asn Val Ser Val Gly Lys Gly Thr Leu 

115 120 125 

Glu Arg Asn Asn Thr Pro Val Val Gly Trp Val Asn Met Ser Gin Ser 

Thr Val Val Leu Gly Thr Asp Gly lie Thr Ser Val Leu Pro Gly Ser 

150 155 ien 

Val Ala Thr Val Ala Thr Gin Glu Asp Glu Gin Gly Asp Glu Asn Lys 

, , 165 170 175 

Ala Arg Gly Asn Trp Ser Ser Lys Leu Asp Phe lie Leu Ser Met Val 

180 I 85 190 

Gly Tyr Ala Val Gly Leu Gly Asn Val Trp Arg Phe Pro Tyr Leu Ala 

, 195 200 205 

Phe Gin Asn Gly Gly Gly Ala Phe Leu lie Pro Tyr Leu Met Met Leu 

215 220 
Ala Leu Ala Gly Leu Pro He Phe Phe Leu Glu Val Ser Leu Gly Gin 

230 235 240 

Phe Ala Ser Gin Gly Pro Val Ser Val Trp Lys Ala lie Pro Ala Leu 

245 250 255 

Gin Gly Cys Gly He Ala Met Leu lie lie Ser Val Leu lie Ala lie 

260 26 5 270 

Tyr Tyr Asn Val He He Cys Tyr Thr Leu Phe Tyr Leu Phe Ala Ser 

*' 5 280 285 

Phe Val Ser Val Leu Pro Trp Gly Ser Cys Asn Asn Pro Trp Asn Thr 

" u 295 300 

Pro Glu Cys Lys Asp Lys Thr Lys Leu Leu Leu Asp Ser Cys Val He 

Ser Asp His Pro Lys He Gin lie Lys Asn Ser Thr Phe Cys Met Thr 

», „ 325 330 335 

Ala Tyr Pro -Asn_Val-Thr_Met -Val Asn Phe Thr Ser Gin Ala-Asn -Lys- 

345 350 

Thr Phe Val Ser Gly Ser Glu Glu Tyr Phe Lys Tyr Phe Val Leu Lys 
355 . 3 6{) jr 



He Ser Ala Gly lie Glu Tyr Pro Gly Glu He Gly Trp Pro Leu Ala 

375 39o 
Leu Cys Leu Phe Leu Ala Trp Val lie Val Tyr Ala Ser Leu Ala Lys 

Gly He Lys Thr Ser 395 400 

405 

(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1597 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

AGCCAACAGC CCGGAGGCGG CGGCGGCGCA GGGCCACCCG GATGGCCCAT GCGCTCCCAG 60

GACGAGCCCG GAGCAGGAGC TTCCCGCGGC TGCCGCCCCG CCGCCGCCAC GTGTGCCCAG 120

GTCCGCTTCC ACCGGCGCCC AAACTTTCCA GTCAGCGGAC GCGCGAGCCT GCGAGGCTGA 180

GCGGCCAGGA GTGGGGTCTT GCAAACTCAG TAGCCCGCGG GCGCAGGCGG CCTCTGCAGC 240

TCTGCGGGAC TTGAGAGAGG CGCAAAGCGC GCAGGCCTCG CCCCCTCCCG GGAGCTCCGG 300

GCCCGGCAAC GCGCTGCACT GTAAGATCCC TTCTCTGCGA GGCCCGGAGG GGGATGCGAA 360 

CGTGAGTGTG GGCAAGGGCA CCCTGGAGCG GAACAATACC CCTGTTGTGG GCTGGGTGAA 420 

CATGAGCCAG AGCACCGTGG TGCTGGGCAC GGATGGAATC ACGTCCGTGC TCCCGGGCAG 480

CGTGGCCACC GTTGCCACCC AGGAGGACGA GCAAGGGGAT GAGAATAAGG CCTGAGGGAA 540 

CTGGTCCAGC AAACTGGACT TCATCCTGTC CATGGTGGGG TACGCAGTGG GGCTGGGCAA 600 

TGTCTGGAGG TTTCCCTACC TGGCCTTCCA GAACGGGGGA GGTGCTTTCC TCATCCCTTA 660 

CCTGATGATG CTGGCTCTGG CTGGATTACC CATCTTCTTC TTGGAGGTGT CGCTGGGCCA 720 

GTTTGCCAGC CAGGGACCAG TGTCTGTGTG GAAGGCCATC CCAGCTCTAC AAGGCTGTGG 780 

CATCGCGATG CTGATCAACT CTGTCCTAAT AGCCATATAC TACAATGTGA TTATTTGCTA 840

TACACTTTTC TACCTGTTTG CCTCCTTTGT GTCTGTACTA CCCTGGGGCT CCTGCAACAA 900 

CCCTTGGAAT ACGCCAGAAT GCAAAGATAA AACCAAACTT TTATTAGATT CCTGTGTTAT 960 

CAGTGACCAT CCCAAAATAC AGATCAAGAA CTCGACTTTC TGCATGACCG CTTATCCCAA 1020 

CGTGACAATG GTTAATTTCA CCAGCCAGGC CAATAAGACA TTTGTCAGTG GAAGTGAGGA 1080 

GTACTTCAAG TACTTTGTGC TGAAGATTTC TGCAGGGATT GAATATCCTG GCGAGATCAG 1140 

GTGGCCACTA GCTCTCTGCC TCTTCCTGGC TTGGGTCATT GTGTATGCAT CGTTGGCTAA 1200 

AGGAATCAAG ACTTCAGGAA AAGTGGTGTA CTTCACGGCC ACGTTCCCGT ATGTCGTACT 1260 

CGTGATCCTC CTCATCCGAG GAGTCACCCT GCCTGGAGCT GGAGCTGGGA TCTGGTACTT 1320 

CATCACACCC AAGTGGGAGA AACTCACGGA TGCCACGGTG TGGAAAGATG CTGCCACTCA 1380 

GATTTTCTTC TCTTTATCTG CTGCATGGGG AGGCCTGATC ACTCTCTCTT CTTACAACAA 1440

ATTCCACAAC AACTGCTACA GGGACACTCT AATTGTCACC TGCACCAACA GTGCCACAAG 1500 

CATCTTTGCC GGCTTCGTCA TCTTCTCCGT TATCGGCTTC ATGGCCAATG AACGCAAAGT 1560 

CAACATTGAG AATGTGGCAG ACCAAGGGCC AGGCATT 1597 

(2) INFORMATION FOR SEQ ID NO: 8: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 177 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Ala Asn Ser Pro Glu Ala Ala Ala Ala Gln Gly His Pro Asp Gly Pro
1 5 10 15

Cys Ala Pro Arg Thr Ser Pro Glu Gln Glu Leu Pro Ala Ala Ala Ala
20 25 30

Pro Pro Pro Pro Arg Val Pro Arg Ser Ala Ser Thr Gly Ala Gln Thr
35 40 45

Phe Gln Ser Ala Asp Ala Arg Ala Cys Glu Ala Glu Arg Pro Gly Val
50 55 60

Gly Ser Cys Lys Leu Ser Ser Pro Arg Ala Gln Ala Ala Ser Ala Ala
65 70 75 80

Leu Arg Asp Leu Arg Glu Ala Gln Ser Ala Gln Ala Ser Pro Pro Pro
85 90 95

Gly Ser Ser Gly Pro Gly Asn Ala Leu His Cys Lys Ile Pro Ser Leu
100 105 110

Arg Gly Pro Glu Gly Asp Ala Asn Val Ser Val Gly Lys Gly Thr Leu
115 120 125

Glu Arg Asn Asn Thr Pro Val Val Gly Trp Val Asn Met Ser Gln Ser
130 135 140

Thr Val Val Leu Gly Thr Asp Gly Ile Thr Ser Val Leu Pro Gly Ser
145 150 155 160

Val Ala Thr Val Ala Thr Gln Glu Asp Glu Gln Gly Asp Glu Asn Lys
165 170 175

Ala

(2) INFORMATION FOR SEQ ID NO: 9: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 354 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Gly Asn Trp Ser Ser Lys Leu Asp Phe Ile Leu Ser Met Val Gly Tyr
1 5 10 15

Ala Val Gly Leu Gly Asn Val Trp Arg Phe Pro Tyr Leu Ala Phe Gln
20 25 30

Asn Gly Gly Gly Ala Phe Leu Ile Pro Tyr Leu Met Met Leu Ala Leu
35 40 45

Ala Gly Leu Pro Ile Phe Phe Leu Glu Val Ser Leu Gly Gln Phe Ala
50 55 60

Ser Gln Gly Pro Val Ser Val Trp Lys Ala Ile Pro Ala Leu Gln Gly
65 70 75 80

Cys Gly Ile Ala Met Leu Ile Asn Ser Val Leu Ile Ala Ile Tyr Tyr
85 90 95

Asn Val Ile Ile Cys Tyr Thr Leu Phe Tyr Leu Phe Ala Ser Phe Val
100 105 110

Ser Val Leu Pro Trp Gly Ser Cys Asn Asn Pro Trp Asn Thr Pro Glu
115 120 125

Cys Lys Asp Lys Thr Lys Leu Leu Leu Asp Ser Cys Val Ile Ser Asp
130 135 140

His Pro Lys Ile Gln Ile Lys Asn Ser Thr Phe Cys Met Thr Ala Tyr
145 150 155 160

Pro Asn Val Thr Met Val Asn Phe Thr Ser Gln Ala Asn Lys Thr Phe
165 170 175

Val Ser Gly Ser Glu Glu Tyr Phe Lys Tyr Phe Val Leu Lys Ile Ser
180 185 190

Ala Gly Ile Glu Tyr Pro Gly Glu Ile Arg Trp Pro Leu Ala Leu Cys
195 200 205

Leu Phe Leu Ala Trp Val Ile Val Tyr Ala Ser Leu Ala Lys Gly Ile
210 215 220

Lys Thr Ser Gly Lys Val Val Tyr Phe Thr Ala Thr Phe Pro Tyr Val
225 230 235 240

Val Leu Val Ile Leu Leu Ile Arg Gly Val Thr Leu Pro Gly Ala Gly
245 250 255

Ala Gly Ile Trp Tyr Phe Ile Thr Pro Lys Trp Glu Lys Leu Thr Asp
260 265 270

Ala Thr Val Trp Lys Asp Ala Ala Thr Gln Ile Phe Phe Ser Leu Ser
275 280 285

Ala Ala Trp Gly Gly Leu Ile Thr Leu Ser Ser Tyr Asn Lys Phe His
290 295 300

Asn Asn Cys Tyr Arg Asp Thr Leu Ile Val Thr Cys Thr Asn Ser Ala
305 310 315 320

Thr Ser Ile Phe Ala Gly Phe Val Ile Phe Ser Val Ile Gly Phe Met
325 330 335

Ala Asn Glu Arg Lys Val Asn Ile Glu Asn Val Ala Asp Gln Gly Pro
340 345 350

Gly Ile



(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 533 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:



rn™^S CTGCCTGGAG CTGGAGCTGG GATCTGGTAC TTCATCACAC CCAACTrrr* 

S?JSSrr ^? GCCACGG TGTGGAAAGA TGCTGCCACT SSS^CT i S 

TGCTGCATGG GGAGGCCTGA TCACTCTCTr rrrTTiritKr jTVi^iii. n-it.liTATC 120 

CAGGGACACT CTAATTGTCA CCTgSccX SSSSff ^I^» ACA ACA ACTGCTA 180 

sees ss siii IS - 

sssss sags liil llii S ™sgs its 

CACACACAAG CCAGTGTTTA SSS SSESS 
(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 177 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Gly Val Thr Leu Pro Gly Ala Gly Ala Gly Ile Trp Tyr Phe Ile Thr
1 5 10 15

Pro Asn Trp Glu Lys Leu Thr Asp Ala Thr Val Trp Lys Asp Ala Ala
20 25 30

Thr Gln Ile Phe Phe Ser Leu Ser Ala Ala Trp Gly Gly Leu Ile Thr
35 40 45

Leu Ser Ser Tyr Asn Lys Phe His Asn Asn Cys Tyr Arg Asp Thr Leu
50 55 60

Ile Val Thr Cys Thr Asn Ser Ala Thr Ser Ile Phe Ala Gly Phe Val
65 70 75 80

Ile Phe Ser Val Ile Gly Phe Met Ala Asn Glu Arg Lys Val Asn Ile
85 90 95

Glu Asn Val Ala Asp Gln Gly Pro Gly Ile Ala Phe Val Val Tyr Pro
100 105 110

Glu Ala Leu Thr Arg Leu Pro Leu Ser Pro Phe Trp Ala Ile Ile Phe
115 120 125

Phe Leu Met Leu Leu Thr Leu Gly Leu Asp Thr Met Phe Ala Thr Ile
130 135 140

Glu Thr Ile Val Thr Ser Ile Ser Asp Glu Phe Pro Lys Tyr Leu Arg
145 150 155 160

Thr His Lys Pro Val Phe Thr Leu Gly Cys Cys Ile Cys Phe Phe Ile
165 170 175

Met


(2) INFORMATION FOR SEQ ID NO: 12: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 533 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

AGGAGTCACC CTGCCTGGAG CTGGAGCTGG GATCTGGTAC TTCATCACAC CCAAGTGGGA 60 

GAAACTCACG AATGCCACGG TGTGGAAAGA TGCTGCCACT CAGATTTTCT TCTCTTTATC 120

TGCTGCATGG GGAGGCCTGA TCACTCTCTC TTCTTACAAC AAATTCCACA ACAACTGCTA 180 

CAGGGACACT CTAATTGTCA CCTGCACCAA CAGTGCCACA AGCATCTTTG CCGGCTTCGT 240

CATCTTCTCC GTTATCGGCT TCATGGCCAA TGAACGCAAA GTCAACATTG AGAATGTGGC 300

AGACCAAGGG CCAGGCATTG CATTTGTGGT TTACCCGGAA GCCTTAACCA GGCTGCCTCT 360 

CTCTCCGTTC TGGGCCATCA TCTTTTTCCT GATGCTCCTC ACTCTTGGAC TTGACACTAT 420 

GTTTGCCACC ATCGAGACCA TAGTGACCTC CATCTCAGAC GAGTTTCCCA AGTACCTACG 480

CACACACAAG CCAGTGTTTA CTCTGGGCTG CTGCATTTGT TTCTTCATCA TGG 533 

(2) INFORMATION FOR SEQ ID NO: 13: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 177 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Gly Val Thr Leu Pro Gly Ala Gly Ala Gly Ile Trp Tyr Phe Ile Thr
1 5 10 15

Pro Lys Trp Glu Lys Leu Thr Asn Ala Thr Val Trp Lys Asp Ala Ala
20 25 30

Thr Gln Ile Phe Phe Ser Leu Ser Ala Ala Trp Gly Gly Leu Ile Thr
35 40 45

Leu Ser Ser Tyr Asn Lys Phe His Asn Asn Cys Tyr Arg Asp Thr Leu
50 55 60

Ile Val Thr Cys Thr Asn Ser Ala Thr Ser Ile Phe Ala Gly Phe Val
65 70 75 80

Ile Phe Ser Val Ile Gly Phe Met Ala Asn Glu Arg Lys Val Asn Ile
85 90 95

Glu Asn Val Ala Asp Gln Gly Pro Gly Ile Ala Phe Val Val Tyr Pro
100 105 110

Glu Ala Leu Thr Arg Leu Pro Leu Ser Pro Phe Trp Ala Ile Ile Phe
115 120 125

Phe Leu Met Leu Leu Thr Leu Gly Leu Asp Thr Met Phe Ala Thr Ile
130 135 140

Glu Thr Ile Val Thr Ser Ile Ser Asp Glu Phe Pro Lys Tyr Leu Arg
145 150 155 160

Thr His Lys Pro Val Phe Thr Leu Gly Cys Cys Ile Cys Phe Phe Ile
165 170 175

Met


(2) INFORMATION FOR SEQ ID NO: 14: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 840 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:



GCTTCGTCAT CTTCTCCGTT ATCGGCTTCA TGGCCAATGA ACGCAAAGTC 
AACATTGAGA ATGTGGCAGA CCAAGGGCCA GGCATTGCAT TTGTGGTTTA CCCGC^rrr 
TTAACCAGGC TGCCTCTCTC TCCGTTCTGG GCCATCATCT TTTTCCTGAT GCTCPTrar^ 
CTTGGACTTG ACACTATGTT TGCCACCATC GAGACCATAG TGACCTCCAT SSSSS? 
I"^ CAAGT AC ^ACGCAC ACACAAGCCA GTGTTTACTC TGGgSgSg CGTTTGTTTr 
^™^S G GTTT TCCAAT GATCACTCAG GGTGGAATTT ACATGTTTC^ GCWctSE 
ACCTATGCTG CCTCCTATGC CCTTGTCATC ATTGCCATTT TTGA rPT r rr rrrr*ZnZ™ 

GCTTGCAAAG ATTCTGTGAA SJSSS tSSSS SwSES 
^TCTTCT GGAAAGTCTG CTGGGCATTT GTAACCCCAA CCATTTTAAC CTTtS?SJ 
l%ll^ T TTTACCAG TG GGAGCCCATG ACCTATGGCT CTTACCGCTA 
TCCATGGTGC TCGGATGGCT AATGCTCGCC TGTTCCGTCA TCTGrATrrr laSi^S^ 
rl^l^ TGCATC ^GC CCCTGGAAS EStkK TOKMOT JSgtcSS 7 , n 
ATr^rr^ ^TGGGGCCC ATTCTTAGCT CAACACCGCG GGGA^CGtS SJSKSEg ill 
ATCGACCCCT TGGGAACCTC TTCCTTGGGA CTCAAACTGC CAGTGA^G^ StoSJSSS III 



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 280 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Ile Phe Ala Gly Phe Val Ile Phe Ser Val Ile Gly Phe Met Ala Asn
1 5 10 15

Glu Arg Lys Val Asn Ile Glu Asn Val Ala Asp Gln Gly Pro Gly Ile
20 25 30

Ala Phe Val Val Tyr Pro Glu Ala Leu Thr Arg Leu Pro Leu Ser Pro
35 40 45

Phe Trp Ala Ile Ile Phe Phe Leu Met Leu Leu Thr Leu Gly Leu Asp
50 55 60

Thr Met Phe Ala Thr Ile Glu Thr Ile Val Thr Ser Ile Ser Asp Glu
65 70 75 80

Phe Pro Lys Tyr Leu Arg Thr His Lys Pro Val Phe Thr Leu Gly Cys
85 90 95

Cys Val Cys Phe Phe Ile Met Gly Phe Pro Met Ile Thr Gln Gly Gly
100 105 110

Ile Tyr Met Phe Gln Leu Val Asp Thr Tyr Ala Ala Ser Tyr Ala Leu
115 120 125

Val Ile Ile Ala Ile Phe Glu Leu Val Gly Ile Ser Tyr Val Tyr Gly
130 135 140

Leu Gln Arg Phe Cys Glu Asp Ile Glu Met Met Ile Gly Phe Gln Pro
145 150 155 160

Asn Ile Phe Trp Lys Val Cys Trp Ala Phe Val Thr Pro Thr Ile Leu
165 170 175

Thr Phe Ile Leu Cys Phe Ser Phe Tyr Gln Trp Glu Pro Met Thr Tyr
180 185 190

Gly Ser Tyr Arg Tyr Pro Asn Trp Ser Met Val Leu Gly Trp Leu Met
195 200 205

Leu Ala Cys Ser Val Ile Trp Ile Pro Ile Met Phe Val Ile Lys Met
210 215 220

His Leu Ala Pro Gly Arg Phe Ile Glu Arg Leu Lys Leu Val Cys Ser
225 230 235 240

Pro Gln Pro Asp Trp Gly Pro Phe Leu Ala Gln His Arg Gly Glu Arg
245 250 255

Tyr Lys Asn Met Ile Asp Pro Leu Gly Thr Ser Ser Leu Gly Leu Lys
260 265 270

Leu Pro Val Lys Asp Leu Glu Leu
275 280



(2) INFORMATION FOR SEQ ID NO: 16: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 840 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 



ATCTTTGCCG GCTTCGTCAT CTTCTCCGTT ATCGGCTTCA TGGCCAATGA ACGCAAAGTC 60 

AACATTGAGA ATGTGGCAGA CCAAGGGCCA GGCATTGCAT TTGTGGTTTA CCCGGAAGCC 120 

TTAACCAGGC TGCCTCTCTC TCCGTTCTGG GCCATCATCT TTTTCCTGAT GCTCCTCACT 180 

CTTGGACTTG ACACTATGTT TGCCACCATC GAGACCATAG TGACCTCCAT CTCAGACGAG 240

TTTCCCAAGT ACCTACGCAC ACACAAGCCA GTGTTTACTC TGGGCTGCTG CATTTGTTTC 300 

TTCATCATGG GTTTTCCAAT GATCACTCAG GGTGGAATTT ACATGTTTCA GCTTGTGGAC 360 

ACCTATGCTG CCTCCTATGC CCTTGTCATC ATTGCCATTT TTGAGCTCGT GGGGATCTCT 420 

TATGTGTATG GCTTGCAAAG ATTCTGTGAA GATATAGAGA TGATGATTGG ATTCCAGCCT 480 

AACATCTTCT GGAAAGTCTG CTGGGCATTT GTAACCCCAA CCATTTTAAC CTTTATCCTT 540 

TGCTTCAGCT TTTACCAGTG GGAGCCCATG ACCTATGGCT CTTACCGCTA TCCTAACTGG 600 

TCCATGGTGC TCGGATGGCT AATGCTCGCC TGTTCCGTCA TCTGGATCCC AATTATGTTT 660 

GTGATAAAAA TGCATCTGGC CCCTGGAAGA TTTATTGAGA GGCTGAAGTT GGTGTGCTCG 720

CCACAGCCGG ACTGGGGCCC ATTCTTAGCT CAACACCGCG GGGAGCGTTA CAAGAACATG 780

ATCGACCCCT TGGGAACCTC TTCCTTGGGA CTCAAACTGC CAGTGAAGGA TTTGGAACTG 840 



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 280 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



Ile Phe Ala Gly Phe Val Ile Phe Ser Val Ile Gly Phe Met Ala Asn
1 5 10 15

Glu Arg Lys Val Asn Ile Glu Asn Val Ala Asp Gln Gly Pro Gly Ile
20 25 30

Ala Phe Val Val Tyr Pro Glu Ala Leu Thr Arg Leu Pro Leu Ser Pro
35 40 45

Phe Trp Ala Ile Ile Phe Phe Leu Met Leu Leu Thr Leu Gly Leu Asp
50 55 60

Thr Met Phe Ala Thr Ile Glu Thr Ile Val Thr Ser Ile Ser Asp Glu
65 70 75 80

Phe Pro Lys Tyr Leu Arg Thr His Lys Pro Val Phe Thr Leu Gly Cys
85 90 95

Cys Ile Cys Phe Phe Ile Met Gly Phe Pro Met Ile Thr Gln Gly Gly
100 105 110

Ile Tyr Met Phe Gln Leu Val Asp Thr Tyr Ala Ala Ser Tyr Ala Leu
115 120 125

Val Ile Ile Ala Ile Phe Glu Leu Val Gly Ile Ser Tyr Val Tyr Gly
130 135 140

Leu Gln Arg Phe Cys Glu Asp Ile Glu Met Met Ile Gly Phe Gln Pro
145 150 155 160

Asn Ile Phe Trp Lys Val Cys Trp Ala Phe Val Thr Pro Thr Ile Leu
165 170 175

Thr Phe Ile Leu Cys Phe Ser Phe Tyr Gln Trp Glu Pro Met Thr Tyr
180 185 190

Gly Ser Tyr Arg Tyr Pro Asn Trp Ser Met Val Leu Gly Trp Leu Met
195 200 205

Leu Ala Cys Ser Val Ile Trp Ile Pro Ile Met Phe Val Ile Lys Met
210 215 220

His Leu Ala Pro Gly Arg Phe Ile Glu Arg Leu Lys Leu Val Cys Ser
225 230 235 240

Pro Gln Pro Asp Trp Gly Pro Phe Leu Ala Gln His Arg Gly Glu Arg
245 250 255

Tyr Lys Asn Met Ile Asp Pro Leu Gly Thr Ser Ser Leu Gly Leu Lys
260 265 270

Leu Pro Val Lys Asp Leu Glu Leu
275 280


(2) INFORMATION FOR SEQ ID NO: 18: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2397 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

ATGGATTGCA GTGCTCCCAA GGAAATGAAT AAACTGCCAG CCAACAGCCC GGAGGCGGCG 60
GCGGCGCAGG GCCACCCGGA TGGCCCATGC GCTCCCAGGA CGAGCCCGGA GCAGGAGCTT 120
CCCGCGGCTG CCGCCCCGCC GCCGCCACGT GTGCCCAGGT CCGCTTCCAC CGGCGCCCAA 180
ACTTTCCAGT CAGCGGACGC GCGAGCCTGC GAGGCTGAGC GGCCAGGAGT GGGGTCTTGC 240
AAACTCAGTA GCCCGCGGGC GCAGGCGGCC TCTGCAGCTC TGCGGGACTT GAGAGAGGCG 300
CAAAGCGCGC AGGCCTCGCC CCCTCCCGGG AGCTCCGGGC CCGGCAACGC GCTGCACTGT 360
AAGATCCCTT CTCTGCGAGG CCCGGAGGGG GATGCGAACG TGAGTGTGGG CAAGGGCACC 420
CTGGAGCGGA ACAATACCCC TGTTGTGGGC TGGGTGAACA TGAGCCAGAG CACCGTGGTG 480
CTGGGCACGG ATGGAATCAC GTCCGTGCTC CCGGGCAGCG TGGCCACCGT TGCCACCCAG 540
GAGGACGAGC AAGGGGATGA GAATAAGGCC CGAGGGAACT GGTCCAGCAA ACTGGACTTC 600
ATCCTGTCCA TGGTGGGGTA CGCAGTGGGG CTGGGCAATG TCTGGAGGTT TCCCTACCTG 660
GCCTTCCAGA ACGGGGGAGG TGCTTTCCTC ATCCCTTACC TGATGATGCT GGCTCTGGCT 720
GGATTACCCA TCTTCTTCTT GGAGGTGTCG CTGGGCCAGT TTGCCAGCCA GGGACCAGTG 780
TCTGTGTGGA AGGCCATCCC AGCTCTACAA GGCTGTGGCA TCGCGATGCT GATCATCTCT 840
GTCCTAATAG CCATATACTA CAATGTGATT ATTTGCTATA CACTTTTCTA CCTGTTTGCC 900
TCCTTTGTGT CTGTACTACC CTGGGGCTCC TGCAACAACC CTTGGAATAC GCCAGAATGC 960
AAAGATAAAA CCAAACTTTT ATTAGATTCC TGTGTTATCA GTGACCATCC CAAAATACAG 1020
ATCAAGAACT CGACTTTCTG CATGACCGCT TATCCCAACG TGACAATGGT TAATTTCACC 1080
AGCCAGGCCA ATAAGACATT TGTCAGTGGA AGTGAAGAGT ACTTCAAGTA CTTTGTGCTG 1140
AAGATTTCTG CAGGGATTGA ATATCCTGGC GAGATCAGGT GGCCACTAGC TCTCTGCCTC 1200
TTCCTGGCTT GGGTCATTGT GTATGCATCG TTGGCTAAAG GAATCAAGAC TTCAGGAAAA 1260
GTGGTGTACT TCACGGCCAC GTTCCCGTAT GTCGTACTCG TGATCCTCCT CATCCGAGGA 1320
GTCACCCTGC CTGGAGCTGG AGCTGGGATC TGGTACTTCA TCACACCCAA GTGGGAGAAA 1380
CTCACGGATG CCACGGTGTG GAAAGATGCT GCCACTCAGA TTTTCTTCTC TTTATCTGCT 1440
GCATGGGGAG GCCTGATCAC TCTCTCTTCT TACAACAAAT TCCACAACAA CTGCTACAGG 1500
GACACTCTAA TTGTCACCTG CACCAACAGT GCCACAAGCA TCTTTGCCGG CTTCGTCATC 1560
TTCTCCGTTA TCGGCTTCAT GGCCAATGAA CGCAAAGTCA ACATTGAGAA TGTGGCAGAC 1620
CAAGGGCCAG GCATTGCATT TGTGGTTTAC CCGGAAGCCT TAACCAGGCT GCCTCTCTCT 1680
CCGTTCTGGG CCATCATCTT TTTCCTGATG CTCCTCACTC TTGGACTTGA CACTATGTTT 1740
GCCACCATCG AGACCATAGT GACCTCCATC TCAGACGAGT TTCCCAAGTA CCTACGCACA 1800
CACAAGCCAG TGTTTACTCT GGGCTGCTGC ATTTGTTTCT TCATCATGGG TTTTCCAATG 1860
ATCACTCAGG GTGGAATTTA CATGTTTCAG CTTGTGGACA CCTATGCTGC CTCCTATGCC 1920
CTTGTCATCA TTGCCATTTT TGAGCTCGTG GGGATCTCTT ATGTGTATGG CTTGCAAAGA 1980
TTCTGTGAAG ATATAGAGAT GATGATTGGA TTCCAGCCTA ACATCTTCTG GAAAGTCTGC 2040
TGGGCATTTG TAACCCCAAC CATTTTAACC TTTATCCTTT GCTTCAGCTT TTACCAGTGG 2100
GAGCCCATGA CCTATGGCTC TTACCGCTAT CCTAACTGGT CCATGGTGCT CGGATGGCTA 2160
ATGCTCGCCT GTTCCGTCAT CTGGATCCCA ATTATGTTTG TGATAAAAAT GCATCTGGCC 2220
CCTGGAAGAT TTATTGAGAG GCTGAAGTTG GTGTGCTCGC CACAGCCGGA CTGGGGCCCA 2280
TTCTTAGCTC AACACCGCGG GGAGCGTTAC AAGAACATGA TCGACCCCTT GGGAACCTCT 2340
TCCTTGGGAC TCAAACTGCC AGTGAAGGAT TTGGAACTGG GCACTCAGTG CTAGTCC 2397

(2) INFORMATION FOR SEQ ID NO: 19: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 797 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 



Met Asp Cys Ser Ala Pro Lys Glu Met Asn Lys Leu Pro Ala Asn Ser
1 5 10 15

Pro Glu Ala Ala Ala Ala Gln Gly His Pro Asp Gly Pro Cys Ala Pro
20 25 30

Arg Thr Ser Pro Glu Gln Glu Leu Pro Ala Ala Ala Ala Pro Pro Pro
35 40 45

Pro Arg Val Pro Arg Ser Ala Ser Thr Gly Ala Gln Thr Phe Gln Ser
50 55 60

Ala Asp Ala Arg Ala Cys Glu Ala Glu Arg Pro Gly Val Gly Ser Cys
65 70 75 80

Lys Leu Ser Ser Pro Arg Ala Gln Ala Ala Ser Ala Ala Leu Arg Asp
85 90 95

Leu Arg Glu Ala Gln Ser Ala Gln Ala Ser Pro Pro Pro Gly Ser Ser
100 105 110

Gly Pro Gly Asn Ala Leu His Cys Lys Ile Pro Ser Leu Arg Gly Pro
115 120 125

Glu Gly Asp Ala Asn Val Ser Val Gly Lys Gly Thr Leu Glu Arg Asn
130 135 140

Asn Thr Pro Val Val Gly Trp Val Asn Met Ser Gln Ser Thr Val Val
145 150 155 160

Leu Gly Thr Asp Gly Ile Thr Ser Val Leu Pro Gly Ser Val Ala Thr
165 170 175

Val Ala Thr Gln Glu Asp Glu Gln Gly Asp Glu Asn Lys Ala Arg Gly
180 185 190

Asn Trp Ser Ser Lys Leu Asp Phe Ile Leu Ser Met Val Gly Tyr Ala
195 200 205

Val Gly Leu Gly Asn Val Trp Arg Phe Pro Tyr Leu Ala Phe Gln Asn
210 215 220

Gly Gly Gly Ala Phe Leu Ile Pro Tyr Leu Met Met Leu Ala Leu Ala
225 230 235 240

Gly Leu Pro Ile Phe Phe Leu Glu Val Ser Leu Gly Gln Phe Ala Ser
245 250 255

Gln Gly Pro Val Ser Val Trp Lys Ala Ile Pro Ala Leu Gln Gly Cys
260 265 270

Gly Ile Ala Met Leu Ile Ile Ser Val Leu Ile Ala Ile Tyr Tyr Asn
275 280 285

Val Ile Ile Cys Tyr Thr Leu Phe Tyr Leu Phe Ala Ser Phe Val Ser
290 295 300

Val Leu Pro Trp Gly Ser Cys Asn Asn Pro Trp Asn Thr Pro Glu Cys
305 310 315 320

Lys Asp Lys Thr Lys Leu Leu Leu Asp Ser Cys Val Ile Ser Asp His
325 330 335

Pro Lys Ile Gln Ile Lys Asn Ser Thr Phe Cys Met Thr Ala Tyr Pro
340 345 350

Asn Val Thr Met Val Asn Phe Thr Ser Gln Ala Asn Lys Thr Phe Val
355 360 365

Ser Gly Ser Glu Glu Tyr Phe Lys Tyr Phe Val Leu Lys Ile Ser Ala
370 375 380

Gly Ile Glu Tyr Pro Gly Glu Ile Arg Trp Pro Leu Ala
385 390 395

Phe Leu Ala Trp Val Ile Val Tyr Ala Ser Leu Ala Lys
405 410

Thr Ser Gly Lys Val Val Tyr Phe Thr Ala Thr Phe Pro
420 425

Leu Val Ile Leu Leu Ile Arg Gly Val Thr Leu Pro Gly
435 440 445

Gly Ile Trp Tyr Phe Ile Thr Pro Lys Trp Glu Lys Leu
450 455 460

Thr Val Trp Lys Asp Ala Ala Thr Gln Ile Phe Phe Ser
465 470 475

Ala Trp Gly Gly Leu Ile Thr Leu Ser Ser Tyr Asn Lys
485 490

Asn Cys Tyr Arg Asp Thr Leu Ile Val Thr Cys Thr Asn
500 505

Ser Ile Phe Ala Gly Phe Val Ile Phe Ser Val Ile Gly
515 520 525

Asn Glu Arg Lys Val Asn Ile Glu Asn Val Ala Asp Gln
530 535 540

Ile Ala Phe Val Val Tyr Pro Glu Ala Leu Thr Arg Leu
545 550 555

Pro Phe Trp Ala Ile Ile Phe Phe Leu Met Leu Leu Thr
565 570

Asp Thr Met Phe Ala Thr Ile Glu Thr Ile Val Thr Ser
580 585

Glu Phe Pro Lys Tyr Leu Arg Thr His Lys Pro Val Phe
595 600 605

Cys Cys Ile Cys Phe Phe Ile Met Gly Phe Pro Met Ile
610 615 620

Gly Ile Tyr Met Phe Gln Leu Val Asp Thr Tyr Ala Ala
625 630 635

Leu Val Ile Ile Ala Ile Phe Glu Leu Val Gly Ile Ser
645 650

Gly Leu Gln Arg Phe Cys Glu Asp Ile Glu Met Met Ile
660 665

Pro Asn Ile Phe Trp Lys Val Cys Trp Ala Phe Val Thr
675 680 685

Leu Thr Phe Ile Leu Cys Phe Ser Phe Tyr Gln Trp Glu
690 695 700

Tyr Gly Ser Tyr Arg Tyr Pro Asn Trp Ser Met Val Leu
705 710 715

Met Leu Ala Cys Ser Val Ile Trp Ile Pro Ile Met Phe
725 730

Met His Leu Ala Pro Gly Arg Phe Ile Glu Arg Leu Lys
740 745



(2) INFORMATION FOR SEQ ID NO: 20: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2397 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 



ATGGATTGCA GTGCTCCCAA GGAAATGAAT AAACTGCCAG CCAACAGCCC GGAGGCGGCG 60 

GCGGCGCAGG GCCACCCGGA TGGCCCATGC GCTCCCAGGA CGAGCCCGGA GCAGGAGCTT 120' 

CCCGCGGCTG CCGCCCCGCC GCCGCCACGT GTGCCCAGGT CCGCTTCCAC CGGCGCCCAA 180 

ACTTTCCAGT CAGCGGACGC GCGAGCCTGC GAGGCTGAGC GGCCAGGAGT GGGGTCTTGC 24 0 
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AAACTCAGTA 

CAAGGCGCGC 

AAGATCCCTT 

CTGGAGCGGA 

CTGGGCACGG 

GAGGACGAGC 

ATCCTGTCCA 

GCCTTCCAGA 

GGATTACCCA 

TCTGTGTGGA 

GTCCTAATAG 

TCCTTTGTGT 

AAAGATAAAA 

ATCAAGAACT 

AGCCAGGCCA 

AAGATTTCTG 

TTCCTGGCTT 

GTGGTGTACT 

GTCACCCTGC 

CTCACGGATG 

GCATGGGGAG 

GACACTCTAA 

TTCTCCGTTA 

CAAGGGCCAG 

CCGTTCTGGG 

GCCACCATCG 

CACAAGCCAG 

ATCACTCAGG 

CTTGTCATCA 

TTCTGTGAAG" 

TGGGCATTTG 

GAGCCCATGA 

ATGCTCGCCT 

CCTGGAAGAT 

TTCTTAGCTC 

TCCTTGGGAC 



GCCCGCGGGC 
AGGCCTCGCC 
TTCTGCGAGG 
ACAATACCCC 
ATGGAATCAC 
AAGGGGATGA 
TGGTGGGGTA 
ACGGGGGAGG 
TCTTCTTCTT 
AGGCCATCCC 
CCATATACTA 
CTGTACTACC 
CCAAACTTTT 
CGACTTTCTG 
ATAAGACATT 
CAGGGATTGA 
GGGTCATTGT 
TCACGGCCAC 
CTGGAGCTGG 
CCACGGTGTG 
GCCTGATCAC 
TTGTCACCTG 
TCGGCTTCAT 
GCATTGCATT 
CCATCATCTT 
AGACCATAGT 
TGTTTACTCT 
GTGGAATTTA 
TTGCCATTTT 
ATATAGAGAT 
TAACCCCAAC 
CCTATGGCTC 
GTTCCGTCAT 
TTATTGAGAG 
AACACCGCGG 
TCAAACTGCC 



GCAGGCGGCC 

CCCTCCCGGG 

CCCGGAGGGG 

TGTTGTGGGC 

GTCCGTGCTC 

GAATAAGGCC 

CGCAGTGGGG 

TGCTTTCCTC 

GGAGGTGTCG 

AGCTCTACAA 

CAATGTGATT 

CTGGGGCTCC 

ATTAGATTCC 

CATGACCGCT 

TGTCAGTGGA 

ATATCCTGGC 

GTATGCATCG 

GTTCCCGTAT 

AGCTGGGATC 

GAAAGATGCT 

TCTCTCTTCT 

CACCAACAGT 

GGCCAATGAA 

TGTGGTTTAC 

TTTCCTGATG 

GACCTCCATC 

GGGCTGCTGC 

CATGTTTCAG 

TGAGCTCGTG 

GATGATTGGA 

CATTTTAACC 

TTACCGCTAT 

CTGGATCCCA 

GCTGAAGTTG 

GGAGCGTTAC 

AGTGAAGGAT 



TCTGCAGCTC 

AGCTCCGGGC 

GATGCGAACG 

TGGGTGAACA 

CCGGGCAGCG 

CGAGGGAACT 

CTGGGCAATG 

ATCCCTTACC 

CTGGGCCAGT 

GGCTGTGGCA 

ATTTGCTATA 

TGCAACAACC 

TGTGTTATCA 

TATCCCAACG 

AGTGAGGAGT 

GAGATCAGGT 

TTGGCTAAAG 

GTCGTACTCG 

TGGTACTTCA 

GCCACTCAGA 

TACAACAAAT 

GCCACAAGCA 

CGCAAAGTCA 

CCGGAAGCCT 

CTCCTCACTC 

TCAGACGAGT 

GTTTGTTTCT 

CTTGTGGACA 

GGGATCTCTT 

TTCCAGCCTA 

TTTATCCTTT 

CCTAACTGGT 

ATTATGTTTG 

GTGTGCTCGC 

AAGAACATGA 

TTGGAACTGG 



TGCGGGACTT 

CCGGCAACGC 

TGAGTGTGGG 

TGAGCCAGAG 

TGGCCACCGT 

GGTCCAGCAA 

TCTGGAGGTT 

TGATGATGCT 

TTGCCAGCCA 

TCGCGATGCT 

CACTTTTCTA 

CTTGGAATAC 

GTGACCATCC 

TGACAATGGT 

ACTTCAAGTA 

GGCCACTAGC 

GAATCAAGAC 

TGATCCTCCT 

TCACACCCAA 

TTTTCTTCTC 

TCCACAACAA 

TCTTTGCCGG 

ACATTGAGAA 

TAACCAGGCT 

TTGGACTTGA 

TTCCCAAGTA 

TCATCATGGG 

CCTATGCTGC 

ATGTGTATGG - 

ACATCTTCTG 

GCTTCAGCTT 

CCATGGTGCT 

TGATAAAAAT 

CACAGCCGGA 

TCGACCCCTT 

GTACTCAATG 



GAGAGAGGCG 

GCTGCACTGT 

CAAGGGCACC 

CACCGTGGTG 

TGCCACCCAG 

ACTGGACTTC 

TCCCTACCTG 

GGCTCTGGCT 

GGGACCAGTG 

GATCAACTCT 

CCTGTTTGCC 

GCCAGAATGC 

CAAAATACAG 

TAATTTCACC 

CTTTGTGCTG 

TCTCTGCCTC 

TTCAGGAAAA 

CATCCGAGGA 

GTGGGAGAAA 

TTTATCTGCT 

CTGCTACAGG 

CTTCGTCATC 

TGTGGCAGAC 

GCCTCTCTCT 

CACTATGTTT 

CCTACGCACA 

TTTTCCAATG 

CTCCTATGCC 

CTTGCAAAGA 

GAAAGTCTGC 

TTACCAGTGG 

CGGATGGCTA 

GCATCTGGCC 

CTGGGGCCCA 

GGGAACCTCT 

TTAATCC 



(2) INFORMATION FOR SEQ ID NO: 21: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 797 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
— "CD")" "TOPOLOGY:" linear 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 21: 



Met Asp Cys Ser Ala Pro Lys Glu Met Asn Lys Leu Pro Ala Asn Ser
1 5 10 15

Pro Glu Ala Ala Ala Ala Gln Gly His Pro Asp Gly Pro Cys Ala Pro
20 25 30

Arg Thr Ser Pro Glu Gln Glu Leu Pro Ala Ala Ala Ala Pro Pro Pro
35 40 45

Pro Arg Val Pro Arg Ser Ala Ser Thr Gly Ala Gln Thr Phe Gln Ser
50 55 60

Ala Asp Ala Arg Ala Cys Glu Ala Glu Arg Pro Gly Val Gly Ser Cys
65 70 75 80

Lys Leu Ser Ser Pro Arg Ala Gln Ala Ala Ser Ala Ala Leu Arg Asp
85 90 95

Leu Arg Glu Ala Gln Gly Ala Gln Ala Ser Pro Pro Pro Gly Ser Ser
100 105 110

Gly Pro Gly Asn Ala Leu His Cys Lys Ile Pro Phe Leu Arg Gly Pro
115 120 125

Glu Gly Asp Ala Asn Val Ser Val Gly Lys Gly Thr Leu Glu Arg Asn
130 135 140

Asn Thr Pro Val Val Gly Trp Val Asn Met Ser Gln Ser Thr Val Val
145 150 155 160



300 
360 
420 
480 
540
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2397

Leu Gly Thr Asp Gly Ile Thr Ser Val Leu Pro Gly Ser Val Ala Thr
165 170 175

Val Ala Thr Gln Glu Asp Glu Gln Gly Asp Glu Asn Lys Ala Arg Gly
180 185 190

Asn Trp Ser Ser Lys Leu Asp Phe Ile Leu Ser Met Val Gly Tyr Ala
195 200 205

Val Gly Leu Gly Asn Val Trp Arg Phe Pro Tyr Leu Ala Phe Gln Asn
210 215 220

Gly Gly Gly Ala Phe Leu Ile Pro Tyr Leu Met Met Leu Ala Leu Ala
225 230 235 240

Gly Leu Pro Ile Phe Phe Leu Glu Val Ser Leu Gly Gln Phe Ala Ser
245 250 255

Gln Gly Pro Val Ser Val Trp Lys Ala Ile Pro Ala Leu Gln Gly Cys
260 265 270

Gly Ile Ala Met Leu Ile Asn Ser Val Leu Ile Ala Ile Tyr Tyr Asn
275 280 285

Val Ile Ile Cys Tyr Thr Leu Phe Tyr Leu Phe Ala Ser Phe Val Ser
290 295 300

305 LSU Pr ° Trp Gly Hi Cys Asn A*" Pro Tr P Asn Thr Pro Glu Cys 
Lys Asp Lys Thr Lys Leu Leu Leu Asp Ser Cys Val Ile Ser Asp His
325 330 335

Pro Lys Ile Gln Ile Lys Asn Ser Thr Phe Cys Met Thr Ala Tyr Pro
340 345 350

Asn Val Thr Met Val Asn Phe Thr Ser Gln Ala Asn Lys Thr Phe Val
355 360 365

Ser Gly Ser Glu Glu Tyr Phe Lys Tyr Phe Val Leu Lys Ile Ser Ala
370 375 380

Gly Ile Glu Tyr Pro Gly Glu Ile Arg Trp Pro Leu Ala Leu Cys Leu
385 390 395 400

Phe Leu Ala Trp Val Ile Val Tyr Ala Ser III Ala Lys Gly Ile Lys
405 410 415

Thr Ser Gly Lys Val Val Tyr Phe Thr Ala Thr Phe Pro Tyr Val Val
420 425 430

Leu Val Ile Leu Leu Ile Arg Gly Val Thr Leu Pro Gly Ala Gly Ala
435 440 445

Gly Ile Trp Tyr Phe Ile Thr Pro Lys Trp Glu Lys Leu Thr Asp Ala
450 455 460

Thr Val Trp Lys Asp Ala Ala Thr Gln Ile Phe Phe Ser Leu Ser Ala
465 470 475 480

Ala Trp Gly Gly Leu Ile Thr Leu Ser Ser Tyr Asn Lys Phe His tin
485 490 495

Asn Cys Tyr Arg Asp Thr Leu Ile Val Thr Cys Thr Asn Ser Ala Thr
500 505 510

Ser Ile Phe Ala Gly Phe Val Ile Phe Ser Val Ile Gly Phe Met Ala
515 520 525

Asn Glu Arg Lys Val Asn Ile Glu Asn Val Ala Asp Gln Gly Pro Gly
530 535 540

Ile Ala Phe Val Val Tyr Pro Glu Ala Leu Thr Arg Leu Pro Leu Ser
545 550 555 560

Pro Phe Trp Ala Ile Ile Phe Phe Leu Met Leu Leu Thr Leu Gly Leu
565 570 575

Asp Thr Met Phe Ala Thr Ile Glu Thr Ile Val Thr Ser Ile Tel Asp
580 585 590

Glu Phe Pro Lys Tyr Leu Arg Thr His Lys Pro Val Phe Thr Leu Gly
595 600 605

Cys Cys Val Cys Phe Phe Ile Met Gly Phe Pro Met Ile Thr Gln Gly
610 615 620

Gly Ile Tyr Met Phe Gln Leu Val Asp Thr Tyr Ala Ala Ser Tyr Ala
625 630 635 640

Leu Val Ile Ile Ala Ile Phe Glu Leu Val Gly Ile Ser Tyr Val Tyr
645 650 655

Gly Leu Gln Arg Phe Cys Glu Asp Ile Glu Met Met Ile Gly Phe Gln
660 665 670

Pro Asn Ile Phe Trp Lys Val Cys Trp Ala Phe Val Thr Pro Thr Ile
675 680 685

Leu Thr Phe Ile Leu Cys Phe Ser Phe Tyr Gln Trp Glu Pro Met Thr
690 695 700

Tyr Gly Ser Tyr Arg Tyr Pro Asn Trp Ser Met Val Leu Gly Trp Leu
705 710 715 720

Met Leu Ala Cys Ser Val Ile Trp Ile Pro Ile Met Phe Val Ile Lys
725 730 735

Met His Leu Ala Pro Gly Arg Phe Ile Glu Arg Leu Lys Leu Val Cys
740 745 750

Ser Pro Gln Pro Asp Trp Gly Pro Phe Leu Ala Gln His Arg Gly Glu
755 760 765

Arg Tyr Lys Asn Met Ile Asp Pro Leu Gly Thr Ser Ser Leu Gly Leu
770 775 780

Lys Leu Pro Val Lys Asp Leu Glu Leu Gly Thr Gln Cys
785 790 795













(2) INFORMATION FOR SEQ ID NO: 22: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 589 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

AGTGTTTACT CTGGGCTGCT ACATTTGTTT CTTCATCATG GGTTTTCCAA TGATCACTCA 60 

GGGTGGAATT TACATGTTTC AGCTTGTGGA CACCTATGCT GCCTCCTATG CCCTTGTCAT 120 

CATTGCCATT TTTGAGCTCG TGGGGATCTC TTATGTGTAT GGCTTGCAAA GATTCTGTGA 180 

AGATATAGAG ATGATGATTG GATTCCAGCC TAACATCTTC TGGAAAGTCT GCTGGGCATT 240

TGTAACCCCA ACCATTTTAA CCTTTATCCT TTGCTTCAGC TTTTACCAGT GGGAGCCCAT 300 

GACCTATGGC TCTTACCGCT ATCCTAACTG GTCCATGGTG CTCGGATGGC TAATGCTCGC 360 

CTGTTCCGTC ATCTGGATCC CAATTATGTT TGTGGTAAAA ATGCATCTGG CCCCTGGAAG 420 

ATTTATTGAG AGGCTGAAGT TGGTGTGCTC GCCACAGCCG GACTGGGGCC CATTCTTAGC 480 

TCAACACCGC GGGGAGCGTT ACAAGAACAT GATCGACCCC TTGGGAACCT CTTCCTTGGG 540 

ACTCAAACTG CCAGTGAAGG ATTTGGAACT GGGCACTCAG TGCTAGTCC 589 

(2) INFORMATION FOR SEQ ID NO: 23: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 194 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Val Phe Thr Leu Gly Cys Tyr Ile Cys Phe Phe Ile Met Gly Phe Pro
1 5 10 15

Met Ile Thr Gln Gly Gly Ile Tyr Met Phe Gln Leu Val Asp Thr Tyr
20 25 30

Ala Ala Ser Tyr Ala Leu Val Ile Ile Ala Ile Phe Glu Leu Val Gly
35 40 45

Ile Ser Tyr Val Tyr Gly Leu Gln Arg Phe Cys Glu Asp Ile Glu Met
50 55 60

Met Ile Gly Phe Gln Pro Asn Ile Phe Trp Lys Val Cys Trp Ala Phe
65 70 75 80

Val Thr Pro Thr Ile Leu Thr Phe Ile Leu Cys Phe Ser Phe Tyr Gln
85 90 95

Trp Glu Pro Met Thr Tyr Gly Ser Tyr Arg Tyr Pro Asn Trp Ser Met
100 105 110

Val Leu Gly Trp Leu Met Leu Ala Cys Ser Val Ile Trp Ile Pro Ile
115 120 125

Met Phe Val Val Lys Met His Leu Ala Pro Gly Arg Phe Ile Glu Arg
130 135 140

Leu Lys Leu Val Cys Ser Pro Gln Pro Asp Trp Gly Pro Phe Leu Ala
145 150 155 160

Gln His Arg Gly Glu Arg Tyr Lys Asn Met Ile Asp Pro Leu Gly Thr
165 170 175

Ser Ser Leu Gly Leu Lys Leu Pro Val Lys Asp Leu Glu Leu Gly Thr
180 185 190

Gln Cys



(2) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 589 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

AGTGTTTACT CTGGGCTGCT GCATTTGTTT CTTCATCATG GGTTTTCCAA TGATCACTCA 60

GGGTGGAATT TACATGTTTC AGCTTGTGGA CACCTATGCT GCCTCCTATG CCCTTGTCAT 120

CATTGCCATT TTTGAGCTCG TGGGGATCTC TTATGTGTAT GGCTTGCAAA GATTCTGTGA 180

AGATATAGAG ATGATGATTG GATTCCAGCC TAACATCTTC TGGAAAGTCT GCTGGGCATT 240

TGTAACCCCA ACCATTTTAA CCTTTATCCT TTGCTTCAGC TTTTACCAGT GGGAACCCAT 300 

GACCTATGGC TCTTACCGCT ATCCTAACTG GTCCATGGTG CTCGGATGGC TAATGCTCGC 360 

CTGTTCCGTC ATCTGGATCC CAATTATGTC TGTGATAAAA ATGCATCTGG CCCCTGGAAG 420 

ATTTATTGAG AGGCTGAAGT TGGTGTGCTC GCCACAGCCG GACTGGGGCC CATTCTTAGC 480 

TCAACACCGC GGGGAGCGTT ACAAGAACAT GATCGACCCC TTGGGAACCT CTTCCTTGGG 540

ACTCAAACTG CCAGTGAAGG ATTTGGAACT GGGCACTCAG TGCTAGTCC 589

(2) INFORMATION FOR SEQ ID NO: 25: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 194 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 

Val Phe Thr Leu Gly Cys Cys Ile Cys Phe Phe Ile Met Gly Phe Pro
1 5 10 15

Met Ile Thr Gln Gly Gly Ile Tyr Met Phe Gln Leu Val Asp Thr Tyr
20 25 30

Ala Ala Ser Tyr Ala Leu Val Ile Ile Ala Ile Phe Glu Leu Val Gly
35 40 45

Ile Ser Tyr Val Tyr Gly Leu Gln Arg Phe Cys Glu Asp Ile Glu Met
50 55 60

Met Ile Gly Phe Gln Pro Asn Ile Phe Trp Lys Val Cys Trp Ala Phe
65 70 75 80

Val Thr Pro Thr Ile Leu Thr Phe Ile Leu Cys Phe Ser Phe Tyr Gln
85 90 95

Trp Glu Pro Met Thr Tyr Gly Ser Tyr Arg Tyr Pro Asn Trp Ser Met
100 105 110

Val Leu Gly Trp Leu Met Leu Ala Cys Ser Val Ile Trp Ile Pro Ile
115 120 125

Met Ser Val Ile Lys Met His Leu Ala Pro Gly Arg Phe Ile Glu Arg
130 135 140

Leu Lys Leu Val Cys Ser Pro Gln Pro Asp Trp Gly Pro Phe Leu Ala
145 150 155 160

Gln His Arg Gly Glu Arg Tyr Lys Asn Met Ile Asp Pro Leu Gly Thr
165 170 175

Ser Ser Leu Gly Leu Lys Leu Pro Val Lys Asp Leu Glu Leu Gly Thr
180 185 190

Gln Cys



(2) INFORMATION FOR SEQ ID NO: 26: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2397 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 1...2391 
(D) OTHER INFORMATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

ATG GAT TGC AGT GCT CCC AAG GAA ATG AAT AAA CTG CCA GCC AAC AGC 48
Met Asp Cys Ser Ala Pro Lys Glu Met Asn Lys Leu Pro Ala Asn Ser
1 5 10 15

CCG GAG GCG GCG GCG GCG CAG GGC CAC CCG GAT GGC CCA TGC GCT CCC 96
Pro Glu Ala Ala Ala Ala Gln Gly His Pro Asp Gly Pro Cys Ala Pro
20 25 30

AGG ACG AGC CCG GAG CAG GAG CTT CCC GCG GCT GCC GCC CCG CCG CCG 144
Arg Thr Ser Pro Glu Gln Glu Leu Pro Ala Ala Ala Ala Pro Pro Pro
35 40 45

CCA CGT GTG CCC AGG TCC GCT TCC ACC GGC GCC CAA ACT TTC CAG TCA 192
Pro Arg Val Pro Arg Ser Ala Ser Thr Gly Ala Gln Thr Phe Gln Ser
50 55 60

GCG GAC GCG CGA GCC TGC GAG GCT GAG CGG CCA GGA GTG GGG TCT TGC 240
Ala Asp Ala Arg Ala Cys Glu Ala Glu Arg Pro Gly Val Gly Ser Cys
65 70 75 80

AAA CTC AGT AGC CCG CGG GCG CAG GCG GCC TCT GCA GCT CTG CGG GAC 288
Lys Leu Ser Ser Pro Arg Ala Gln Ala Ala Ser Ala Ala Leu Arg Asp
85 90 95

TTG AGA GAG GCG CAA GGC GCG CAG GCC TCG CCC CCT CCC GGG AGC TCC 336
Leu Arg Glu Ala Gln Gly Ala Gln Ala Ser Pro Pro Pro Gly Ser Ser
100 105 110

GGG CCC GGC AAC GCG CTG CAC TGT AAG ATC CCT TCT CTG CGA GGC CCG 384
Gly Pro Gly Asn Ala Leu His Cys Lys Ile Pro Ser Leu Arg Gly Pro
115 120 125

GAG GGG GAT GCG AAC GTG AGT GTG GGC AAG GGC ACC CTG GAG CGG AAC 432
Glu Gly Asp Ala Asn Val Ser Val Gly Lys Gly Thr Leu Glu Arg Asn
130 135 140

AAT ACC CCT GTT GTG GGC TGG GTG AAC ATG AGC CAG AGC ACC GTG GTG 480
Asn Thr Pro Val Val Gly Trp Val Asn Met Ser Gln Ser Thr Val Val
145 150 155 160

CTG GGC ACG GAT GGA ATC ACG TCC GTG CTC CCG GGC AGC GTG GCC ACC 528
Leu Gly Thr Asp Gly Ile Thr Ser Val Leu Pro Gly Ser Val Ala Thr
165 170 175

GTT GCC ACC CAG GAG GAC GAG CAA GGG GAT GAG AAT AAG GCC CGA GGG 576
Val Ala Thr Gln Glu Asp Glu Gln Gly Asp Glu Asn Lys Ala Arg Gly
180 185 190

AAC TGG TCC AGC AAA CTG GAC TTC ATC CTG TCC ATG GTG GGG TAC GCA 624
Asn Trp Ser Ser Lys Leu Asp Phe Ile Leu Ser Met Val Gly Tyr Ala
195 200 205

GTG GGG CTG GGC AAT GTC TGG AGG TTT CCC TAC CTG GCC TTC CAG AAC 672
Val Gly Leu Gly Asn Val Trp Arg Phe Pro Tyr Leu Ala Phe Gln Asn
210 215 220

GGG GGA GGT GCT TTC CTC ATC CCT TAC CTG ATG ATG CTG GCT CTG GCT 720
Gly Gly Gly Ala Phe Leu Ile Pro Tyr Leu Met Met Leu Ala Leu Ala
225 230 235 240

GGA TTA CCC ATC TTC TTC TTG GAG GTG TCG CTG GGC CAG TTT GCC AGC 768
Gly Leu Pro Ile Phe Phe Leu Glu Val Ser Leu Gly Gln Phe Ala Ser
245 250 255

CAG GGA CCA GTG TCT GTG TGG AAG GCC ATC CCA GCT CTA CAA GGC TGT 816
Gln Gly Pro Val Ser Val Trp Lys Ala Ile Pro Ala Leu Gln Gly Cys
260 265 270

GGC ATC GCG ATG CTG ATC ATC TCT GTC CTA ATA GCC ATA TAC TAC AAT 864
Gly Ile Ala Met Leu Ile Ile Ser Val Leu Ile Ala Ile Tyr Tyr Asn
275 280 285

GTG ATT ATT TGC TAT ACA CTT TTC TAC CTG TTT GCC TCC TTT GTG TCT 912
Val Ile Ile Cys Tyr Thr Leu Phe Tyr Leu Phe Ala Ser Phe Val Ser
290 295 300

GTA CTA CCC TGG GGC TCC TGC AAC AAC CCT TGG AAT ACG CCA GAA TGC 960
Val Leu Pro Trp Gly Ser Cys Asn Asn Pro Trp Asn Thr Pro Glu Cys
305 310 315 320

AAA GAT AAA ACC AAA CTT TTA TTA GAT TCC TGT GTT ATC AGT GAC CAT 1008
Lys Asp Lys Thr Lys Leu Leu Leu Asp Ser Cys Val Ile Ser Asp His
325 330 335

CCC AAA ATA CAG ATC AAG AAC TCG ACT TTC TGC ATG ACC GCT TAT CCC 1056
Pro Lys Ile Gln Ile Lys Asn Ser Thr Phe Cys Met Thr Ala Tyr Pro
340 345 350

AAC GTG ACA ATG GTT AAT TTC ACC AGC CAG GCC AAT AAG ACA TTT GTC 1104
Asn Val Thr Met Val Asn Phe Thr Ser Gln Ala Asn Lys Thr Phe Val
355 360 365

AGT GGA AGT GAA GAG TAC TTC AAG TAC TTT GTG CTG AAG ATT TCT GCA 1152
Ser Gly Ser Glu Glu Tyr Phe Lys Tyr Phe Val Leu Lys Ile Ser Ala
370 375 380

GGG ATT GAA TAT CCT GGC GAG ATC AGG TGG CCA CTA GCT CTC TGC CTC 1200
Gly Ile Glu Tyr Pro Gly Glu Ile Arg Trp Pro Leu Ala Leu Cys Leu
385 390 395 400

TTC CTG GCT TGG GTC ATT GTG TAT GCA TCG TTG GCT AAA GGA ATC AAG 1248
Phe Leu Ala Trp Val Ile Val Tyr Ala Ser Leu Ala Lys Gly Ile Lys
405 410 415

ACT TCA GGA AAA GTG GTG TAC TTC ACG GCC ACG TTC CCG TAT GTC GTA 1296
Thr Ser Gly Lys Val Val Tyr Phe Thr Ala Thr Phe Pro Tyr Val Val
420 425 430

CTC GTG ATC CTC CTC ATC CGA GGA GTC ACC CTG CCT GGA GCT GGA GCT 1344
Leu Val Ile Leu Leu Ile Arg Gly Val Thr Leu Pro Gly Ala Gly Ala
435 440 445

GGG ATC TGG TAC TTC ATC ACA CCC AAG TGG GAG AAA CTC ACG GAT GCC 1392
Gly Ile Trp Tyr Phe Ile Thr Pro Lys Trp Glu Lys Leu Thr Asp Ala
450 455 460

ACG GTG TGG AAA GAT GCT GCC ACT CAG ATT TTC TTC TCT TTA TCT GCT 1440
Thr Val Trp Lys Asp Ala Ala Thr Gln Ile Phe Phe Ser Leu Ser Ala
465 470 475 480

GCA TGG GGA GGC CTG ATC ACT CTC TCT TCT TAC AAC AAA TTC CAC AAC 1488
Ala Trp Gly Gly Leu Ile Thr Leu Ser Ser Tyr Asn Lys Phe His Asn
485 490 495

AAC TGC TAC AGG GAC ACT CTA ATT GTC ACC TGC ACC AAC AGT GCC ACA 1536
Asn Cys Tyr Arg Asp Thr Leu Ile Val Thr Cys Thr Asn Ser Ala Thr
500 505 510

AGC ATC TTT GCC GGC TTC GTC ATC TTC TCC GTT ATC GGC TTC ATG GCC 1584
Ser Ile Phe Ala Gly Phe Val Ile Phe Ser Val Ile Gly Phe Met Ala
515 520 525

AAT GAA CGC AAA GTC AAC ATT GAG AAT GTG GCA GAC CAA GGG CCA GGC 1632
Asn Glu Arg Lys Val Asn Ile Glu Asn Val Ala Asp Gln Gly Pro Gly
530 535 540

ATT GCA TTT GTG GTT TAC CCG GAA GCC TTA ACC AGG CTG CCT CTC TCT 1680
Ile Ala Phe Val Val Tyr Pro Glu Ala Leu Thr Arg Leu Pro Leu Ser
545 550 555 560

CCG TTC TGG GCC ATC ATC TTT TTC CTG ATG CTC CTC ACT CTT GGA CTT 1728
Pro Phe Trp Ala Ile Ile Phe Phe Leu Met Leu Leu Thr Leu Gly Leu
565 570 575

GAC ACT ATG TTT GCC ACC ATC GAG ACC ATA GTG ACC TCC ATC TCA GAC 1776
Asp Thr Met Phe Ala Thr Ile Glu Thr Ile Val Thr Ser Ile Ser Asp
580 585 590

GAG TTT CCC AAG TAC CTA CGC ACA CAC AAG CCA GTG TTT ACT CTG GGC 1824
Glu Phe Pro Lys Tyr Leu Arg Thr His Lys Pro Val Phe Thr Leu Gly
595 600 605

TGC TGC ATT TGT TTC TTC ATC ATG GGT TTT CCA ATG ATC ACT CAG GGT 1872
Cys Cys Ile Cys Phe Phe Ile Met Gly Phe Pro Met Ile Thr Gln Gly
610 615 620

GGA ATT TAC ATG TTT CAG CTT GTG GAC ACC TAT GCT GCC TCC TAT GCC 1920
Gly Ile Tyr Met Phe Gln Leu Val Asp Thr Tyr Ala Ala Ser Tyr Ala
625 630 635 640

CTT GTC ATC ATT GCC ATT TTT GAG CTC GTG GGG ATC TCT TAT GTG TAT 1968
Leu Val Ile Ile Ala Ile Phe Glu Leu Val Gly Ile Ser Tyr Val Tyr
645 650 655

GGC TTG CAA AGA TTC TGT GAA GAT ATA GAG ATG ATG ATT GGA TTC CAG 2016
Gly Leu Gln Arg Phe Cys Glu Asp Ile Glu Met Met Ile Gly Phe Gln
660 665 670

CCT AAC ATC TTC TGG AAA GTC TGC TGG GCA TTT GTA ACC CCA ACC ATT 2064
Pro Asn Ile Phe Trp Lys Val Cys Trp Ala Phe Val Thr Pro Thr Ile
675 680 685

TTA ACC TTT ATC CTT TGC TTC AGC TTT TAC CAG TGG GAG CCC ATG ACC 2112
Leu Thr Phe Ile Leu Cys Phe Ser Phe Tyr Gln Trp Glu Pro Met Thr
690 695 700

TAT GGC TCT TAC CGC TAT CCT AAC TGG TCC ATG GTG CTC GGA TGG CTA 2160
Tyr Gly Ser Tyr Arg Tyr Pro Asn Trp Ser Met Val Leu Gly Trp Leu
705 710 715 720

ATG CTC GCC TGT TCC GTC ATC TGG ATC CCA ATT ATG TTT GTG ATA AAA 2208
Met Leu Ala Cys Ser Val Ile Trp Ile Pro Ile Met Phe Val Ile Lys
725 730 735

ATG CAT CTG GCC CCT GGA AGA TTT ATT GAG AGG CTG AAG TTG GTG TGC 2256
Met His Leu Ala Pro Gly Arg Phe Ile Glu Arg Leu Lys Leu Val Cys
740 745 750

TCG CCA CAG CCG GAC TGG GGC CCA TTC TTA GCT CAA CAC CGC GGG GAG 2304
Ser Pro Gln Pro Asp Trp Gly Pro Phe Leu Ala Gln His Arg Gly Glu
755 760 765

CGT TAC AAG AAC ATG ATC GAC CCC TTG GGA ACC TCT TCC TTG GGA CTC 2352
Arg Tyr Lys Asn Met Ile Asp Pro Leu Gly Thr Ser Ser Leu Gly Leu
770 775 780

AAA CTG CCA GTG AAG GAT TTG GAA CTG GGC ACT CAG TGC TAGTCC 2397
Lys Leu Pro Val Lys Asp Leu Glu Leu Gly Thr Gln Cys
785 790 795

(2) INFORMATION FOR SEQ ID NO:27: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 797 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Met Asp Cys Ser Ala Pro Lys Glu Met Asn Lys Leu Pro Ala Asn Ser
1 5 10 15

Pro Glu Ala Ala Ala Ala Gln Gly His Pro Asp Gly Pro Cys Ala Pro
20 25 30

Arg Thr Ser Pro Glu Gln Glu Leu Pro Ala Ala Ala Ala Pro Pro Pro
35 40 45

Pro Arg Val Pro Arg Ser Ala Ser Thr Gly Ala Gln Thr Phe Gln Ser
50 55 60

Ala Asp Ala Arg Ala Cys Glu Ala Glu Arg Pro Gly Val Gly Ser Cys
65 70 75 80

Lys Leu Ser Ser Pro Arg Ala Gln Ala Ala Ser Ala Ala Leu Arg Asp
85 90 95

Leu Arg Glu Ala Gln Gly Ala Gln Ala Ser Pro Pro Pro Gly Ser Ser
100 105 110

Gly Pro Gly Asn Ala Leu His Cys Lys Ile Pro Ser Leu Arg Gly Pro
115 120 125

Glu Gly Asp Ala Asn Val Ser Val Gly Lys Gly Thr Leu Glu Arg Asn
130 135 140

Asn Thr Pro Val Val Gly Trp Val Asn Met Ser Gln Ser Thr Val Val
145 150 155 160

Leu Gly Thr Asp Gly Ile Thr Ser Val Leu Pro Gly Ser Val Ala Thr
165 170 175

Val Ala Thr Gln Glu Asp Glu Gln Gly Asp Glu Asn Lys Ala Arg Gly
180 185 190

Asn Trp Ser Ser Lys Leu Asp Phe Ile Leu Ser Met Val Gly Tyr Ala
195 200 205

Val Gly Leu Gly Asn Val Trp Arg Phe Pro Tyr Leu Ala Phe Gln Asn
210 215 220

Gly Gly Gly Ala Phe Leu Ile Pro Tyr Leu Met Met Leu Ala Leu Ala
225 230 235 240

Gly Leu Pro Ile Phe Phe Leu Glu Val Ser Leu Gly Gln Phe Ala Ser
245 250 255

Gln Gly Pro Val Ser Val Trp Lys Ala Ile Pro Ala Leu Gln Gly Cys
260 265 270

Gly Ile Ala Met Leu Ile Ile Ser Val Leu Ile Ala Ile Tyr Tyr Asn
275 280 285

Val Ile Ile Cys Tyr Thr Leu Phe Tyr Leu Phe Ala Ser Phe Val Ser
290 295 300

Val Leu Pro Trp Gly Ser Cys Asn Asn Pro Trp Asn Thr Pro Glu Cys
305 310 315 320

Lys Asp Lys Thr Lys Leu Leu Leu Asp Ser Cys Val Ile Ser Asp His
325 330 335

Pro Lys Ile Gln Ile Lys Asn Ser Thr Phe Cys Met Thr Ala Tyr Pro
340 345 350

Asn Val Thr Met Val Asn Phe Thr Ser Gln Ala Asn Lys Thr Phe Val
355 360 365

Ser Gly Ser Glu Glu Tyr Phe Lys Tyr Phe Val Leu Lys Ile Ser Ala
370 375 380

Gly Ile Glu Tyr Pro Gly Glu Ile Arg Trp Pro Leu Ala Leu Cys Leu
385 390 395 400

Phe Leu Ala Trp Val Ile Val Tyr Ala Ser Leu Ala Lys Gly Ile Lys
405 410 415

Thr Ser Gly Lys Val Val Tyr Phe Thr Ala Thr Phe Pro Tyr Val Val
420 425 430

Leu Val Ile Leu Leu Ile Arg Gly Val Thr Leu Pro Gly Ala Gly Ala
435 440 445

Gly Ile Trp Tyr Phe Ile Thr Pro Lys Trp Glu Lys Leu Thr Asp Ala
450 455 460

Thr Val Trp Lys Asp Ala Ala Thr Gln Ile Phe Phe Ser Leu Ser Ala
465 470 475 480

Ala Trp Gly Gly Leu Ile Thr Leu Ser Ser Tyr Asn Lys Phe His Asn
485 490 495

Asn Cys Tyr Arg Asp Thr Leu Ile Val Thr Cys Thr Asn Ser Ala Thr
500 505 510

Ser Ile Phe Ala Gly Phe Val Ile Phe Ser Val Ile Gly Phe Met Ala
515 520 525

Asn Glu Arg Lys Val Asn Ile Glu Asn Val Ala Asp Gln Gly Pro Gly
530 535 540

Ile Ala Phe Val Val Tyr Pro Glu Ala Leu Thr Arg Leu Pro Leu Ser
545 550 555 560

Pro Phe Trp Ala Ile Ile Phe Phe Leu Met Leu Leu Thr Leu Gly Leu
565 570 575

Asp Thr Met Phe Ala Thr Ile Glu Thr Ile Val Thr Ser Ile Ser Asp
580 585 590

Glu Phe Pro Lys Tyr Leu Arg Thr His Lys Pro Val Phe Thr Leu Gly
595 600 605

Cys Cys Ile Cys Phe Phe Ile Met Gly Phe Pro Met Ile Thr Gln Gly
610 615 620

Gly Ile Tyr Met Phe Gln Leu Val Asp Thr Tyr Ala Ala Ser Tyr Ala
625 630 635 640

Leu Val Ile Ile Ala Ile Phe Glu Leu Val Gly Ile Ser Tyr Val Tyr
645 650 655

Gly Leu Gln Arg Phe Cys Glu Asp Ile Glu Met Met Ile Gly Phe Gln
660 665 670

Pro Asn Ile Phe Trp Lys Val Cys Trp Ala Phe Val Thr Pro Thr Ile
675 680 685

Leu Thr Phe Ile Leu Cys Phe Ser Phe Tyr Gln Trp Glu Pro Met Thr
690 695 700

Tyr Gly Ser Tyr Arg Tyr Pro Asn Trp Ser Met Val Leu Gly Trp Leu
705 710 715 720

Met Leu Ala Cys Ser Val Ile Trp Ile Pro Ile Met Phe Val Ile Lys
725 730 735

Met His Leu Ala Pro Gly Arg Phe Ile Glu Arg Leu Lys Leu Val Cys
740 745 750

Ser Pro Gln Pro Asp Trp Gly Pro Phe Leu Ala Gln His Arg Gly Glu
755 760 765

Arg Tyr Lys Asn Met Ile Asp Pro Leu Gly Thr Ser Ser Leu Gly Leu
770 775 780

Lys Leu Pro Val Lys Asp Leu Glu Leu Gly Thr Gln Cys
785 790 795











(2) INFORMATION FOR SEQ ID NO:28: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2397 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 1...2391 
(D) OTHER INFORMATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

ATG GAT TGC AGT GCT CCC AAG GAA ATG AAT AAA CTG CCA GCC AAC AGC 48
Met Asp Cys Ser Ala Pro Lys Glu Met Asn Lys Leu Pro Ala Asn Ser
1 5 10 15

CCG GAG GCG GCG GCG GCG CAG GGC CAC CCG GAT GGC CCA TGC GCT CCC 96
Pro Glu Ala Ala Ala Ala Gln Gly His Pro Asp Gly Pro Cys Ala Pro
20 25 30

AGG ACG AGC CCG GAG CAG GAG CTT CCC GCG GCT GCC GCC CCG CCG CCG 144
Arg Thr Ser Pro Glu Gln Glu Leu Pro Ala Ala Ala Ala Pro Pro Pro
35 40 45

CCA CGT GTG CCC AGG TCC GCT TCC ACC GGC GCC CAA ACT TTC CAG TCA 192
Pro Arg Val Pro Arg Ser Ala Ser Thr Gly Ala Gln Thr Phe Gln Ser
50 55 60

GCG GAC GCG CGA GCC TGC GAG GCT GAG CGG CCA GGA GTG GGG TCT TGC 240
Ala Asp Ala Arg Ala Cys Glu Ala Glu Arg Pro Gly Val Gly Ser Cys
65 70 75 80

AAA CTC AGT AGC CCG CGG GCG CAG GCG GCC TCT GCA GCT CTG CGG GAC 288
Lys Leu Ser Ser Pro Arg Ala Gln Ala Ala Ser Ala Ala Leu Arg Asp
85 90 95

TTG AGA GAG GCG CAA GGC GCG CAG GCC TCG CCC CCT CCC GGG AGC TCC 336
Leu Arg Glu Ala Gln Gly Ala Gln Ala Ser Pro Pro Pro Gly Ser Ser
100 105 110

GGG CCC GGC AAC GCG CTG CAC TGT AAG ATC CCT TCT CTG CGA GGC CCG 384
Gly Pro Gly Asn Ala Leu His Cys Lys Ile Pro Ser Leu Arg Gly Pro
115 120 125

GAG GGG GAT GCG AAC GTG AGT GTG GGC AAG GGC ACC CTG GAG CGG AAC 432
Glu Gly Asp Ala Asn Val Ser Val Gly Lys Gly Thr Leu Glu Arg Asn
130 135 140

AAT ACC CCT GTT GTG GGC TGG GTG AAC ATG AGC CAG AGC ACC GTG GTG 480
Asn Thr Pro Val Val Gly Trp Val Asn Met Ser Gln Ser Thr Val Val
145 150 155 160

CTG GGC ACG GAT GGA ATC ACG TCC GTG CTC CCG GGC AGC GTG GCC ACC 528
Leu Gly Thr Asp Gly Ile Thr Ser Val Leu Pro Gly Ser Val Ala Thr
165 170 175



576 



624 



i ( f r T ££ C £ AG ^ G GAC GAG CAA GGG GAT ^ AAT AAG GCC CGA GGG 

Val Ala Thr Gln Glu Asp Glu Gln Gly Asp Glu Asn Lys Ala Arg Gly
180 185 190 

AAC TGG TCC AGC AAA CTG GAC TTC ATC CTG TCC ATG GTG GGG TAC GCA 
Asn Trp Ser Ser Lys Leu Asp Phe Ile Leu Ser Met Val Gly Tyr Ala
195 200 205 

GTG GGG CTG GGC AAT GTC TGG AGG TTT CCC TAC CTG GCC TTC CAG AAC 672
Val Gly Leu Gly Asn Val Trp Arg Phe Pro Tyr Leu Ala Phe Gln Asn
210 215 220

GGG GGA GGT GCT TTC CTC ATC CCT TAC CTG ATG ATG CTG GCT CTG GCT 720
Gly Gly Gly Ala Phe Leu Ile Pro Tyr Leu Met Met Leu Ala Leu Ala
225 230 235 240

GGA TTA CCC ATC TTC TTC TTG GAG GTG TCG CTG GGC CAG TTT GCC AGC 768
Gly Leu Pro Ile Phe Phe Leu Glu Val Ser Leu Gly Gln Phe Ala Ser
245 250 255

CAG GGA CCA GTG TCT GTG TGG AAG GCC ATC CCA GCT CTA CAA GGC TGT 816
Gln Gly Pro Val Ser Val Trp Lys Ala Ile Pro Ala Leu Gln Gly Cys
260 265 270

GGC ATC GCG ATG CTG ATC ATC TCT GTC CTA ATA GCC ATA TAC TAC AAT 864
Gly Ile Ala Met Leu Ile Ile Ser Val Leu Ile Ala Ile Tyr Tyr Asn
275 280 285

GTG ATT ATT TGC TAT ACA CTT TTC TAC CTG TTT GCC TCC TTT GTG TCT 912
Val Ile Ile Cys Tyr Thr Leu Phe Tyr Leu Phe Ala Ser Phe Val Ser
290 295 300

GTA CTA CCC TGG GGC TCC TGC AAC AAC CCT TGG AAT ACG CCA GAA TGC 960
Val Leu Pro Trp Gly Ser Cys Asn Asn Pro Trp Asn Thr Pro Glu Cys
305 310 315 320

AAA GAT AAA ACC AAA CTT TTA TTA GAT TCC TGT GTT ATC AGT GAC CAT 1008
Lys Asp Lys Thr Lys Leu Leu Leu Asp Ser Cys Val Ile Ser Asp His
325 330 335

CCC AAA ATA CAG ATC AAG AAC TCG ACT TTC TGC ATG ACC GCT TAT CCC 1056
Pro Lys Ile Gln Ile Lys Asn Ser Thr Phe Cys Met Thr Ala Tyr Pro
340 345 350

AAC GTG ACA ATG GTT AAT TTC ACC AGC CAG GCC AAT AAG ACA TTT GTC 1104
Asn Val Thr Met Val Asn Phe Thr Ser Gln Ala Asn Lys Thr Phe Val
355 360 365

AGT GGA AGT GAG GAG TAC TTC AAG TAC TTT GTG CTG AAG ATT TCT GCA 1152
Ser Gly Ser Glu Glu Tyr Phe Lys Tyr Phe Val Leu Lys Ile Ser Ala
370 375 380

GGG ATT GAA TAT CCT GGC GAG ATC AGG TGG CCA CTA GCT CTC TGC CTC 1200
Gly Ile Glu Tyr Pro Gly Glu Ile Arg Trp Pro Leu Ala Leu Cys Leu
385 390 395 400

TTC CTG GCT TGG GTC ATT GTG TAT GCA TCG TTG GCT AAA GGA ATC AAG 1248
Phe Leu Ala Trp Val Ile Val Tyr Ala Ser Leu Ala Lys Gly Ile Lys
405 410 415

ACT TCA GGA AAA GTG GTG TAC TTC ACG GCC ACG TTC CCG TAT GTC GTA 1296
Thr Ser Gly Lys Val Val Tyr Phe Thr Ala Thr Phe Pro Tyr Val Val
420 425 430

CTC GTG ATC CTC CTC ATC CGA GGA GTC ACC CTG CCT GGA GCT GGA GCT 1344
Leu Val Ile Leu Leu Ile Arg Gly Val Thr Leu Pro Gly Ala Gly Ala
435 440 445

GGG ATC TGG TAC TTC ATC ACA CCC AAG TGG GAG AAA CTC ACG GAT GCC 1392
Gly Ile Trp Tyr Phe Ile Thr Pro Lys Trp Glu Lys Leu Thr Asp Ala
450 455 460

ACG GTG TGG AAA GAT GCT GCC ACT CAG ATT TTC TTC TCT TTA TCT GCT 1440
Thr Val Trp Lys Asp Ala Ala Thr Gln Ile Phe Phe Ser Leu Ser Ala
465 470 475 480

GCA TGG GGA GGC CTG ATC ACT CTC TCT TCT TAC AAC AAA TTC CAC AAC 1488
Ala Trp Gly Gly Leu Ile Thr Leu Ser Ser Tyr Asn Lys Phe His Asn
485 490 495

AAC TGC TAC AGG GAC ACT CTA ATT GTC ACC TGC ACC AAC AGT GCC ACA 1536
Asn Cys Tyr Arg Asp Thr Leu Ile Val Thr Cys Thr Asn Ser Ala Thr
500 505 510

AGC ATC TTT GCC GGC TTC GTC ATC TTC TCC GTT ATC GGC TTC ATG GCC 1584
Ser Ile Phe Ala Gly Phe Val Ile Phe Ser Val Ile Gly Phe Met Ala




515 520 525

AAT GAA CGC AAA GTC AAC ATT GAG AAT GTG GCA GAC CAA GGG CCA GGC 1632
Asn Glu Arg Lys Val Asn Ile Glu Asn Val Ala Asp Gln Gly Pro Gly
530 535 540

ATT GCA TTT GTG GTT TAC CCG GAA GCC TTA ACC AGG CTG CCT CTC TCT 1680
Ile Ala Phe Val Val Tyr Pro Glu Ala Leu Thr Arg Leu Pro Leu Ser
545 550 555 560

CCG TTC TGG GCC ATC ATC TTT TTC CTG ATG CTC CTC ACT CTT GGA CTT 1728
Pro Phe Trp Ala Ile Ile Phe Phe Leu Met Leu Leu Thr Leu Gly Leu
565 570 575



GAC ACT ATG TTT GCC ACC ATC GAG ACC ATA GTG ACC TCC ATC TCA GAP m* 

Asp Thr Met Phe Ala Thr lie Glu Thr He Val ?hr S lie S Sp ^ 
580 585 59 0 

GAG TTT CCC AAG TAC CTA CGC ACA CAC AAG CCA GTG TTT ACT CTG GGC IB?* 

Glu Phe Pro Lys Tyr Leu Arg Thr His Lys Pro Val Jhe ?£ £u 1824 



605 



TGC TGC ATT TGT TTC TTC ATC ATG GGT TTT CCA ATG ATC APT r«r rr,. no-,-, 
Cys Cys lie Cys Phe Phe lie Met Gly III S 35 cfy 1872 

blu 615 620 

GGA ATT TAC ATG TTT CAG CTT GTG GAC ACC TAT GCT GCC TCC TAT GCC 
Gly lie Tyr Met Phe Gin Leu Val Asp Thr Tyr Ala Ala Ser lyl S 
fa " 630 635 640 

CTT GTC ATC ATT GCC ATT TTT GAG CTC GTG GGG ATC TCT TAT GTG TAT 
Leu Val He He Ala lie Phe Glu Leu Val Gly III III ™ r vll Tyl 
645 650 , 655 

GGC TTG CAA AGA TTC TGT GAA GAT ATA GAG ATG ATG ATT GGA TTC CAG 
Gly Leu Gin Arg Phe Cys Glu Asp lie Glu Met Set III |ly III c?n 

CCT AAC ATC TTC TGG AAA GTC TGC TGG GCA TTT GTA ACC CCA ACT ATT 
Pro Asn lie Phe Trp Lys Val Cys Trp Ala p£e Val ?£r pS J£ He 

0/0 680 68 5 

U» tk° ll T ~^ C -CTT-TGC TTC AGC TTT TAC CAG TGG GAG CCC" ATG - ACC™ 
Leu Thr Phe lie Leu Cys Phe Ser Phe Tyr Gin Trp OlZ Pro Met Thr 



1920 



1968 



2016 



2064 



2112 



TAT GGC TCT TAC CGC TAT CCT AAC TGG TCC ATG GTG CTC GGA TGG CTA 
Tyr Gly Ser Tyr Arg Tyr Pro Asn Trp Ser Met vll Leu GlJ ?rp 21 

liV 715 720 

ATG CTC GCC TGT TCC GTC ATC TGG ATC CCA ATT ATG TTT CTr at* hls 
Met Leu Ala Cys Ser Val lie Trp lie Pro III {£? 52 vll lli J£ 

730 735 

ATG CAT CTG GCC CCT GGA AGA TTT ATT GAG AGG CTG AAG TTG GTr Trr 
Met «» Leu Ala Pro Giy Arg Phe „ Glu Arg JJ| JJJ Leu vTl lyl 

745 750 

TCG CCA CAG CCG GAC TGG GGC CCA TTC TTA GCT CAA CAC CGC GGG CAP 
Ser Pro Gin Pro Asp Trp Gly Pro Phe Leu Ala G^ £s Arg Gly Su 

760 765 

CGT TAC AAG AAC ATG ATC GAC CCC TTG GGA ACC TCT TCC TTG GGA CTr 
Arg Tyr Lys Asn Met lie Asp Pro Leu Gly Thr Ser Kr Su gj 2S 

fU 7 '5 780 



2160 



2208 



2256 



2304 



2352
AAA CTG CCA GTG AAG GAT TTG GAA CTG GGA ACG CAA TGC TAATCC 2397
Lys Leu Pro Val Lys Asp Leu Glu Leu Gly Thr Gln Cys
785 790 795 

(2) INFORMATION FOR SEQ ID NO: 29: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 949 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: 

AGGCGCAAGG CGCGCAGGCC TCGCCCCCTC CCGGGAGCTC CGGGCCCGGC AACGCGTTGC 60 

ACTGTAAGAT CCCTTCTCTG CGAGGCCCGG AGGGGGATGC GAACGTGAGT GTGGGCAAGG 120 

GCACCCTGGA GCGGAACAAT ACCCCTGTTG TGGGCTGGGT GAACATGAGC CAGAGCACCG 180 

TGGTGCTGGG CACGGATGGA ATCACGTCCG TGCTCCCGGG CAGCGTGGCC ACCGTTGCCA 240

CCCAGGAGGA CGAGCAAGGG GATGAGAATA AGGCCCGAGG GAACTGGTCC AGCAAACTGG 300 

ACTTCATCCT GTCCATGGTG GGGTACGCAG TGGGGCTGGG CAATGTCTGG AGGTTTCCCT 360 

ACCTGGCCTT CCAGAACGGG GGAGGTGCTT TCCTCATCCC TTACCTGATG ATGCTGGCTC 420 

TGGCTGGATT ACCCATCTTC TTCTTGGAGG TGTCGCTGGG CCAGTTTGCC AGCCAGGGAC 480

CAGTGTCTGT GTGGAAGGCC ATCCCAGCTC TACAAGGCTG TGGCATCGCG ATGCTGATCA 540

TCTCTGTCCT AATAGCCATA TACTACAATG TGATTATTTG CTATACACTT TTCTACCTGT 600 

TTGCCTCCTT TGTGTCTGTA CTACCCTGGG GCTCCTGCAA CAACCCTTGG AATACACCAG 660 

AATGCAAAGA TAAAACCAAA CTTTTATTAG ATTCCTGTGT TATCAGTGAC CATCCCAAAA 720

TACAGATCAA GAACTCGACT TTCTGCATGA CCGCTTATCC CAACGTGACA ATGGTTAATT 780

TCACCAGCCA GGCCAATAAG ACATTTGTCA GTGGAAGTGA AGAGTACTTC AAGTACTTTG 840

TGCTGAAGAT TTCTGCAGGG ATTGAATATC CTGGCGAGAT CAGGTGGCCA CTAGCTCTCT 900

GCCTCTTCCT GGCTTGGGTC ATTGTGTATG CATCGTTGGC TAAAGGAAT 949

(2) INFORMATION FOR SEQ ID NO: 30: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 315 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Ala Gln Gly Ala Gln Ala Ser Pro Pro Pro Gly Ser Ser Gly Pro Gly
1 5 10 15

Asn Ala Leu His Cys Lys Ile Pro Ser Leu Arg Gly Pro Glu Gly Asp
20 25 30

Ala Asn Val Ser Val Gly Lys Gly Thr Leu Glu Arg Asn Asn Thr Pro
35 40 45

Val Val Gly Trp Val Asn Met Ser Gln Ser Thr Val Val Leu Gly Thr
50 55 60

Asp Gly Ile Thr Ser Val Leu Pro Gly Ser Val Ala Thr Val Ala Thr
65 70 75 80

Gln Glu Asp Glu Gln Gly Asp Glu Asn Lys Ala Arg Gly Asn Trp Ser
85 90 95

Ser Lys Leu Asp Phe Ile Leu Ser Met Val Gly Tyr Ala Val Gly Leu
100 105 110

Gly Asn Val Trp Arg Phe Pro Tyr Leu Ala Phe Gln Asn Gly Gly Gly
115 120 125

Ala Phe Leu Ile Pro Tyr Leu Met Met Leu Ala Leu Ala Gly Leu Pro
130 135 140

Ile Phe Phe Leu Glu Val Ser Leu Gly Gln Phe Ala Ser Gln Gly Pro
145 150 155 160

Val Ser Val Trp Lys Ala Ile Pro Ala Leu Gln Gly Cys Gly Ile Ala
165 170 175

Met Leu Ile Ile Ser Val Leu Ile Ala Ile Tyr Tyr Asn Val Ile Ile
180 185 190

Cys Tyr Thr Leu Phe Tyr Leu Phe Ala Ser Phe Val Ser Val Leu Pro
195 200 205

Trp Gly Ser Cys Asn Asn Pro Trp Asn Thr Pro Glu Cys Lys Asp Lys
210 215 220

Thr Lys Leu Leu Leu Asp Ser Cys Val Ile Ser Asp His Pro Lys Ile
225 230 235 240

Gln Ile Lys Asn Ser Thr Phe Cys Met Thr Ala Tyr Pro Asn Val Thr
245 250 255

Met Val Asn Phe Thr Ser Gln Ala Asn Lys Thr Phe Val Ser Gly Ser
260 265 270

Glu Glu Tyr Phe Lys Tyr Phe Val Leu Lys Ile Ser Ala Gly Ile Glu
275 280 285

Tyr Pro Gly Glu Ile Arg Trp Pro Leu Ala Leu Cys Leu Phe Leu Ala
290 295 300

Trp Val Ile Val Tyr Ala Ser Leu Ala Lys Gly
305 310 315













(2) INFORMATION FOR SEQ ID NO: 31: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 949 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

AGGCGCAAGG CGCGCAGGCC TCGCCCCCTC CCGGGAGCTC CGGGCCCGGC AACGCGTTGC 60 

ACTGTAAGAT CCCTTCTCTG CGAGGCCCGG AGGGGGATGC GAACGTGAGT GTGGGCAAGG 120 

GCACCCTGGA GCGGAACAAT ACCCCTGTTG TGGGCTGGGT GAACATGAGC CAGAGCACCG 180 

TGGTGCTGGG CACGGATGGA ATCACGTCCG TGCTCCCGGG CAGCGTGGCC ACCGTTGCCA 240 

CCCAGGAGGA CGAGCAAGGG GATGAGAATA AGGCCCGAGG GAACTGGTCC AGCAAACTGG 300 

ACTTCATCCT GTCCATGGTG GGGTACGCAG TGGGGCTGGG CAATGTCTGG AGGTTTCCCT 360 

ACCTGGCCTT CCAGAACGGG GGAGGTGCTT TCCTCATCCC TTACCTGATG ATGCTGGCTC 420 

TGGCTGGATT ACCCATCCTC TTCTTGGAGG TGTCGCTGGG CCAGTTTGCC AGCCAGGGAC 480

CAGTGTCTGT GTGGAAGGCC ATCCCAGCTC TACAAGGCTG TGGCATCGCG ATGCTGATCA 540

TCTCTGTCCT AATAGCCATA TACTACAATG TGATTATTTG CTATACACTT TTCTACCTGT 600 

TTGCCTCCTT TGTGTCTGTA CTACCCTGGG GCTCCTGCAA CAACCCTTGG AATACACCAG 660 

AATGCAAAGA TAAAACCAAA CTTTTATTAG ATTCCTGTGT TATCAGTGAC CATCCCAAAA 720 

TACAGATCAA GAACTCGACT TTCTGCATGA CCGCTTATCC CAACGTGACA ATGGTTAATT 780 

TCACCAGCCA GGCCAATAAG ATATTTGTCA GTGGAAGTGA AGAGTACTTC AAGTACTTTG 840 

TGCTGAAGAT TTCTGCAGGG ATTGAATATC CTGGCGAGAT CAGGTGGCCA CTAGCTCTCT 900 

GCCTCTTCCT GGCTTGGGTC ATTGTGTATG CATCGTTGGC TAAAGGAAT 949 

(2) INFORMATION FOR SEQ ID NO: 32: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 315 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Ala Gln Gly Ala Gln Ala Ser Pro Pro Pro Gly Ser Ser Gly Pro Gly
1 5 10 15

Asn Ala Leu His Cys Lys Ile Pro Ser Leu Arg Gly Pro Glu Gly Asp
20 25 30

Ala Asn Val Ser Val Gly Lys Gly Thr Leu Glu Arg Asn Asn Thr Pro
35 40 45

Val Val Gly Trp Val Asn Met Ser Gln Ser Thr Val Val Leu Gly Thr
50 55 60

Asp Gly Ile Thr Ser Val Leu Pro Gly Ser Val Ala Thr Val Ala Thr
65 70 75 80

Gln Glu Asp Glu Gln Gly Asp Glu Asn Lys Ala Arg Gly Asn Trp Ser
85 90 95

Ser Lys Leu Asp Phe Ile Leu Ser Met Val Gly Tyr Ala Val Gly Leu
100 105 110

Gly Asn Val Trp Arg Phe Pro Tyr Leu Ala Phe Gln Asn Gly Gly Gly
115 120 125

Ala Phe Leu Ile Pro Tyr Leu Met Met Leu Ala Leu Ala Gly Leu Pro
130 135 140

Ile Leu Phe Leu Glu Val Ser Leu Gly Gln Phe Ala Ser Gln Gly Pro
145 150 155 160

Val Ser Val Trp Lys Ala Ile Pro Ala Leu Gln Gly Cys Gly Ile Ala
165 170 175

Met Leu Ile Ile Ser Val Leu Ile Ala Ile Tyr Tyr Asn Val Ile Ile
180 185 190

Cys Tyr Thr Leu Phe Tyr Leu Phe Ala Ser Phe Val Ser Val Leu Pro
195 200 205

Trp Gly Ser Cys Asn Asn Pro Trp Asn Thr Pro Glu Cys Lys Asp Lys
210 215 220

Thr Lys Leu Leu Leu Asp Ser Cys Val Ile Ser Asp His Pro Lys Ile
225 230 235 240

Gln Ile Lys Asn Ser Thr Phe Cys Met Thr Ala Tyr Pro Asn Val Thr
245 250 255

Met Val Asn Phe Thr Ser Gln Ala Asn Lys Ile Phe Val Ser Gly Ser
260 265 270

Glu Glu Tyr Phe Lys Tyr Phe Val Leu Lys Ile Ser Ala Gly Ile Glu
275 280 285

Tyr Pro Gly Glu Ile Arg Trp Pro Leu Ala Leu Cys Leu Phe Leu Ala
290 295 300

Trp Val Ile Val Tyr Ala Ser Leu Ala Lys Gly
305 310 315













(2) INFORMATION FOR SEQ ID NO: 33: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 949 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

AGGCGCAAGG CGCGCAGGCC TCGCCCCCTC CCGGGAGCTC CGGGCCGGGC AACGCGCTGC 60 

ACTGTAAGAT CCCTTCTCTG CGAGGCCCGG AGGGGGATGC GAACGTGAGT GTGGGCAAGG 120 

GCACCCTGGA GCGGAACAAT ACCCCTGTTG TGGGCTGGGT GAACATGAGC CAGAGCACCG 180

TGGTGCTGGG CACGGATGGA ATCACGTCCG TGCTCCCGGG CAGCGTGGCC ACCGTTGCCA 240

CCCAGGAGGA CGAGCAAGGG GATGAGAATA AGGCCCGAGG GAACTGGTCC AGCAAACTGG 300 

ACTTCATCCT GTCCATGGTG GGGTACGCAG TGGGGCTGGG CAATGTCTGG AGGTTTCCCT 360 

ACCTGGCCTT CCAGAACGGG GGAGGTGCTT TCCTCATCCC TTACCTGATG ATGCTGGCTC 420 

TGGCTGGATT ACCCATCTTC TTCTTGGAGG TGTCGCTGGG CCAGTTTGCC AGCCAGGGAC 480 

CGGTGTCTGT GTGGAAGGCC ATCCCAGCTC TACAAGGCTG TGGCATCGCG ATGCTGATCA 540

TCTCTGTCCT AATAGCCATA TACTACAATG TGATTATTTG CTATACACTT TTCTACCTGT 600 

TTGCCTCCTT TGTGTCTGTA CTACCCTGGG GCTCCTGCAA CAACCCTTGG AATACGCCAG 660 

AATGCAAAGA TAAAACCAAA CTTTTATTAG ATTCCTGTGT TATCAGTGAC CATCCCAAAA 720

TACAGATCAA GAACTCGACT TTCTGCATGA CCGCTTATCC CAACGTGACA ATGGTTAATT 780

TCACCAGCCA GGCCAATAAG ACATTTGTCA GTGGAAGTGA AGAGTACTTC AAGTACTTTG 840

TGCTGAAGAT TTCTGCAGGG ATTGAATATC CTGGCGAGAT CAGGTGGCCA CTAGCTCTCT 900 

GCCCCTTCCT GGCTTGGGTC ATTGTGTATG CATCGTTGGC TAAAGGAAT 949 

(2) INFORMATION FOR SEQ ID NO: 34: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 315 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Ala Gln Gly Ala Gln Ala Ser Pro Pro Pro Gly Ser Ser Gly Pro Gly
1 5 10 15

Asn Ala Leu His Cys Lys Ile Pro Ser Leu Arg Gly Pro Glu Gly Asp
20 25 30

Ala Asn Val Ser Val Gly Lys Gly Thr Leu Glu Arg Asn Asn Thr Pro
35 40 45

Val Val Gly Trp Val Asn Met Ser Gln Ser Thr Val Val Leu Gly Thr
50 55 60

Asp Gly Ile Thr Ser Val Leu Pro Gly Ser Val Ala Thr Val Ala Thr
65 70 75 80

Gln Glu Asp Glu Gln Gly Asp Glu Asn Lys Ala Arg Gly Asn Trp Ser
85 90 95

Ser Lys Leu Asp Phe Ile Leu Ser Met Val Gly Tyr Ala Val Gly Leu
100 105 110

Gly Asn Val Trp Arg Phe Pro Tyr Leu Ala Phe Gln Asn Gly Gly Gly
115 120 125

Ala Phe Leu Ile Pro Tyr Leu Met Met Leu Ala Leu Ala Gly Leu Pro
130 135 140

Ile Phe Phe Leu Glu Val Ser Leu Gly Gln Phe Ala Ser Gln Gly Pro
145 150 155 160

Val Ser Val Trp Lys Ala Ile Pro Ala Leu Gln Gly Cys Gly Ile Ala
165 170 175

Met Leu Ile Ile Ser Val Leu Ile Ala Ile Tyr Tyr Asn Val Ile Ile
180 185 190

Cys Tyr Thr Leu Phe Tyr Leu Phe Ala Ser Phe Val Ser Val Leu Pro
195 200 205

Trp Gly Ser Cys Asn Asn Pro Trp Asn Thr Pro Glu Cys Lys Asp Lys
210 215 220

Thr Lys Leu Leu Leu Asp Ser Cys Val Ile Ser Asp His Pro Lys Ile
225 230 235 240

Gln Ile Lys Asn Ser Thr Phe Cys Met Thr Ala Tyr Pro Asn Val Thr
245 250 255

Met Val Asn Phe Thr Ser Gln Ala Asn Lys Thr Phe Val Ser Gly Ser
260 265 270

Glu Glu Tyr Phe Lys Tyr Phe Val Leu Lys Ile Ser Ala Gly Ile Glu
275 280 285

Tyr Pro Gly Glu Ile Arg Trp Pro Leu Ala Leu Cys Pro Phe Leu Ala
290 295 300

Trp Val Ile Val Tyr Ala Ser Leu Ala Lys Gly
305 310 315

(2) INFORMATION FOR SEQ ID NO: 35: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1303 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

AGGCGCAAAG CGCGCAGGCC TCGCCCCCTC CCGGGAGCTC CGGGCCCGGC AACGCGCTGC 60 

ACTGTAAGAT CCCTTCTCTG CGAGGCCCGG AGGGGGATGC GAACGTGAGT GTGGGCAAGG 120 

GCACCCTGGA GCGGAACAAT ACCCCTGTTG TGGGCTGGGT GAACATGAGC CAGAGCACCG 180

TGGTGCTGGG CACGGATGGA ATCACGTCCG TGCTCCCGGG CAGCGTGGCC ACCGTTGCCA 240 

CCCAGGAGGA CGAGCAAGGG GATGAGAATA AGGCCCGAGG GAACTGGTCC AGCAAACTGG 300 

ACTTCATCCT GTCCATGGTG GGGTACGCAG TGGGGCTGGG CAATGTCTGG AGGTTTCCCT 360

ACCTGGCCTT CCAGAACGGG GGAGGTGCTT TCCTCATCCC TTACCTGATG ATGCTGGCTC 420

TGGCTGGATT ACCCATCTTC TTCTTGGAGG TGTCGCTGGG CCAGTTTGCC AGCCAGGGAC 480

CAGTGTCTGT GTGGAAGGCC ATCCCAGCTC TACAAGGCTG TGGCATCGCG ATGCTGATCA 540 

TCTCTGTCCT AATAGCCATA TACTACAATG TGATTATTTG CTATACACTT TTCTACCTGT 600 

TTGCCTCCTT TGTGTCTCTA CTACCCTGGG GCTCCTGCAA CAACCCTTGG StACGCcS 660 

AATGCAAAGA TAAAACCAAA CTTTTATTAG ATTCCTGTGT TATCAGTGAC CATCCCAAAA 720 

TACAGATCAA GAACTCGACT TTCTGCATGA CCGCTTATCC CAACGTGACA ATGGTTAATT 780 

^ CCAATAAG ACA TTTGTCA GTGGAAGTGA AGAGTACTTC AAGTACTTTG 840 

TGCTGAAGAT TTCTGCAGGG ATTGAATATC CTGGCGAGAT CAGGTGGCCA CTAGCTCTCT 900 

GCCTCTTCCT GGCTTGGGTC ATTGTGTATG CATCGTTGGC TAAAGGAATC AAGACTTCAG 960 

GAAAAGTGGT GTACTTCACG GCCACGTTCC CGTATGTCGT ACTCGTGATC CTCCTCATCC 1020 

CCTGCCTGGA GCTGGAGCTG GGATCTGGTA CTTCATCACA CCCAAGTGGG 1080 

AGAAACTCAC GGATGCCACG GTGTGGAAAG ATGCTGCCAC TCAGATTTTC TTCTCTTTAT 1140 

CTGCTGCATG GGGAGGCCTG ATCACTCTCT CTTCTTACAA CAAATTCCAC AACAACTGCT 1200 

ACAGGGACAC TCTAATTGTC ACCTGCACCA ACAGTGCCAC AAGCATCTTT GCCGGCTTCG 1260 

TCATCTTCTC CGTTATCGGC TTCATGGCCA ATGAACGCAA AGT 1303

(2) INFORMATION FOR SEQ ID NO: 36: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 433 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

Ala Gln Ser Ala Gln Ala Ser Pro Pro Pro Gly Ser Ser Gly Pro Gly
1 5 10 15

Asn Ala Leu His Cys Lys Ile Pro Ser Leu Arg Gly Pro Glu Gly Asp
20 25 30

Ala Asn Val Ser Val Gly Lys Gly Thr Leu Glu Arg Asn Asn Thr Pro
35 40 45

Val Val Gly Trp Val Asn Met Ser Gln Ser Thr Val Val Leu Gly Thr
50 55 60

Asp Gly Ile Thr Ser Val Leu Pro Gly Ser Val Ala Thr Val Ala Thr
65 70 75 80

Gln Glu Asp Glu Gln Gly Asp Glu Asn Lys Ala Arg Gly Asn Trp Ser
85 90 95

Ser Lys Leu Asp Phe Ile Leu Ser Met Val Gly Tyr Ala Val Gly Leu
100 105 110

Gly Asn Val Trp Arg Phe Pro Tyr Leu Ala Phe Gln Asn Gly Gly Gly
115 120 125

Ala Phe Leu Ile Pro Tyr Leu Met Met Leu Ala Leu Ala Gly Leu Pro
130 135 140

Ile Phe Phe Leu Glu Val Ser Leu Gly Gln Phe Ala Ser Gln Gly Pro
145 150 155 160

Val Ser Val Trp Lys Ala Ile Pro Ala Leu Gln Gly Cys Gly Ile Ala
165 170 175

Met Leu Ile Ile Ser Val Leu Ile Ala Ile Tyr Tyr Asn Val Ile Ile
180 185 190

Cys Tyr Thr Leu Phe Tyr Leu Phe Ala Ser Phe Val Ser Leu Leu Pro
195 200 205

Trp Gly Ser Cys Asn Asn Pro Trp Asn Thr Pro Glu Cys Lys Asp Lys
210 215 220

Thr Lys Leu Leu Leu Asp Ser Cys Val Ile Ser Asp His Pro Lys Ile
225 230 235 240

Gln Ile Lys Asn Ser Thr Phe Cys Met Thr Ala Tyr Pro Asn Val Thr
245 250 255

Met Val Asn Phe Thr Ser Gln Ala Asn Lys Thr Phe Val Ser Gly Ser
260 265 270

Glu Glu Tyr Phe Lys Tyr Phe Val Leu Lys Ile Ser Ala Gly Ile Glu
275 280 285

Tyr Pro Gly Glu Ile Arg Trp Pro Leu Ala Leu Cys Leu Phe Leu Ala
290 295 300

Trp Val Ile Val Tyr Ala Ser Leu Ala Lys Gly Ile Lys Thr Ser Gly
305 310 315 320

Lys Val Val Tyr Phe Thr Ala Thr Phe Pro Tyr Val Val Leu Val Ile
325 330 335

Leu Leu Ile Arg Gly Val Thr Leu Pro Gly Ala Gly Ala Gly Ile Trp
340 345 350

Tyr Phe Ile Thr Pro Lys Trp Glu Lys Leu Thr Asp Ala Thr Val Trp
355 360 365

Lys Asp Ala Ala Thr Gln Ile Phe Phe Ser Leu Ser Ala Ala Trp Gly
370 375 380

Gly Leu Ile Thr Leu Ser Ser Tyr Asn Lys Phe His Asn Asn Cys Tyr
385 390 395 400

Arg Asp Thr Leu Ile Val Thr Cys Thr Asn Ser Ala Thr Ser Ile Phe
405 410 415

Ala Gly Phe Val Ile Phe Ser Val Ile Gly Phe Met Ala Asn Glu Arg
420 425 430

Lys



(2) INFORMATION FOR SEQ ID NO: 37: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

CCNAARGARA TGAAYAARCC NCC 
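
Several oligonucleotides in this listing, SEQ ID NO: 37 among them, are written with IUPAC ambiguity codes (N, R, Y and so on), so each entry stands for a pool of concrete sequences rather than a single one. The Python sketch below is an illustration only and is not part of the original disclosure. It counts and enumerates the members of such a pool using the standard IUPAC nucleotide code; the helper names `degeneracy` and `expand` are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "S": "CG", "W": "AT", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer: str):
    """Yield every concrete sequence represented by a degenerate primer."""
    for combo in product(*(IUPAC[base] for base in primer)):
        yield "".join(combo)


# SEQ ID NO: 37 as printed above, with the spacing removed.
SEQ_ID_37 = "".join("CCNAARGARA TGAAYAARCC NCC".split())

if __name__ == "__main__":
    print(len(SEQ_ID_37), "nt;", degeneracy(SEQ_ID_37), "sequences in the pool")
```

Under these assumptions the six ambiguous positions of SEQ ID NO: 37 (two N, three R, one Y) multiply out to 4 x 2 x 2 x 2 x 2 x 4 members.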

(2) INFORMATION FOR SEQ ID NO: 38: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38 

GCNGTGAAGT ACACCACTTT NCC 

(2) INFORMATION FOR SEQ ID NO: 39:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39 

CCNAARGARA TGAAYAARCC NCC 

(2) INFORMATION FOR SEQ ID NO: 40: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

GGCYTCNGGG TAARCCACRA ANGC 

(2) INFORMATION FOR SEQ ID NO: 41: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 



CGGTTCAATC TGTTGTCCGC ATCAGACATG 

(2) INFORMATION FOR SEQ ID NO: 42: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

GCAGGCTCGC GCGTCCGCTG 

(2) INFORMATION FOR SEQ ID NO: 43: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
CCCGTATGTC GTACTCGTGA TCCTCCTCAT CCG

(2) INFORMATION FOR SEQ ID NO: 44: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

CCNCCRTGNG TDATCATNGG RAANCCC 

(2) INFORMATION FOR SEQ ID NO: 45: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45: 

CCA^CACAC TACTGGAYYA RCAYTGNGTN CC 

(2) INFORMATION FOR SEQ ID NO: 46: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

CAGATTTCCT TCTCTTTATC TGCTGCATGG 

(2) INFORMATION FOR SEQ ID NO: 47: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

GGRTCDATCA TRTTYTTRTA NCKYTCNCC 

(2) INFORMATION FOR SEQ ID NO: 48:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

CCTGCACCAA CAGTGCCACA AGC 

(2) INFORMATION FOR SEQ ID NO: 49: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

CCAAGTACCT ACGCACACAC AAGCC 

(2) INFORMATION FOR SEQ ID NO:50: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 28 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

GGATTAATAC GGGACCATCC ACACTACT 

(2) INFORMATION FOR SEQ ID NO: 51: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51: 

AGCTCTGCGG GACTTGAGAG 

(2) INFORMATION FOR SEQ ID NO: 52: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

GTACACCACT TTTCCTGAAG TCTTG 

(2) INFORMATION FOR SEQ ID NO: 53: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

CCTTGGTCTG CCACATTCTC AATGTTG 



In summary, the sequences of the Sequence Listing are as follows:

SEQ ID   Type      Sequence        Corres. Clone
1        N.A.      nt 1-190        phG2-3-a
2        Protein   aa 1-63
3        N.A.      nt 1-190        phG2-3-b
4        Protein   aa 1-63
5        N.A.      nt 39-1254      phG2-1
6        Protein   aa 14-418
7        N.A.      nt 39-1635      phG2-2
8        Protein   aa 14-190
9        Protein   aa 192-545
10       N.A.      nt 1317-1847    phG2-4-a
11       Protein   aa 440-615
12       N.A.      nt 1317-1847    phG2-4-b
13       Protein   aa 440-615
14       N.A.      nt 1540-2379    phG2-7-a
15       Protein   aa 514-793
16       N.A.      nt 1540-2379    phG2-7-b
17       Protein   aa 514-793
18       N.A.      nt 1-2397
19       Protein   aa 1-797
20       N.A.      nt 1-2397       pHGT2-a
21       Protein   aa 1-797
22       N.A.      nt 1809-2397    phG2-8-a
23       Protein   aa 604-797
24       N.A.      nt 1809-2397    phG2-8-b
25       Protein   aa 604-797
26       N.A.      nt 1-2397
27       Protein   aa 1-797
28       N.A.      nt 1-2397       pHGT2-b
29       N.A.      nt 296-1244     phG2-9-a
30       Protein   aa 100-414
31       N.A.      nt 296-1244     phG2-9-b
32       Protein   aa 100-414
33       N.A.      nt 296-1244     phG2-9-c
34       Protein   aa 100-414
35       N.A.      nt 296-1598     phG2-10
36       Protein   aa 100-532





SEQ ID NO:28 encodes the same protein as SEQ ID NO:26, though with
somewhat different codon usage. 
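
Two nucleic acids can encode the same protein with different codon usage because most amino acids are specified by more than one codon. The Python sketch below illustrates the point; it is not part of the original disclosure. Since the full sequences of SEQ ID NO:26 and SEQ ID NO:28 are not reproduced in this part of the listing, it uses as a stand-in the first 60 nucleotides of SEQ ID NO:29 and SEQ ID NO:33 as printed above. The reading frame (translation starting at the third nucleotide of each clone) is an assumption taken from the summary table, which pairs nt 296-1244 with aa 100-414. The function name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate complete codons of `dna`, starting at 0-based offset `frame`."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Nucleotides 1-60 of two clone sequences, copied from the listing.
SEQ29_1_60 = "".join(
    "AGGCGCAAGG CGCGCAGGCC TCGCCCCCTC CCGGGAGCTC CGGGCCCGGC AACGCGTTGC".split()
)
SEQ33_1_60 = "".join(
    "AGGCGCAAGG CGCGCAGGCC TCGCCCCCTC CCGGGAGCTC CGGGCCGGGC AACGCGCTGC".split()
)

if __name__ == "__main__":
    # Assumed frame: the first complete codon starts at nucleotide 3.
    print(translate(SEQ29_1_60, frame=2))
    print(translate(SEQ33_1_60, frame=2))
```

The two fragments differ at two nucleotides (CCC versus CCG, TTG versus CTG) yet translate to the same 19 residues, which begin the sequences of SEQ ID NO:30 and SEQ ID NO:34.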



The nucleic acid sequences described herein, and consequently the protein
sequences derived therefrom, have been carefully sequenced. However, those of ordinary
skill will recognize that nucleic acid sequencing technology can be susceptible to some
error. Those of ordinary skill in the relevant arts are capable of validating or correcting
these sequences based on the ample description herein of methods of isolating the nucleic
acid sequences in question, and such modifications that are made readily available by the
present disclosure are encompassed by the present invention. Furthermore, those
sequences reported herein are within the invention whether or not later clarifying studies 
identify sequencing errors. 
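
One simple way to cross-check independently sequenced clones is to compare them position by position and report where they disagree. The Python sketch below is an illustration only, not a method described in this document. It compares nucleotides 421-480 of SEQ ID NO:29 and SEQ ID NO:31 as printed above. The conversion to full-length numbering assumes, following the summary table, that position 1 of these clones corresponds to nt 296 of the full-length sequence. The function name `mismatches` is hypothetical.

```python
def mismatches(seq_a: str, seq_b: str, first_position: int = 1):
    """Return (position, base_a, base_b) for every site where two aligned,
    equal-length sequences differ. `first_position` is the coordinate
    assigned to the first base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (first_position + i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# Nucleotides 421-480 of two clone sequences, copied from the listing.
SEQ29_421_480 = "".join(
    "TGGCTGGATT ACCCATCTTC TTCTTGGAGG TGTCGCTGGG CCAGTTTGCC AGCCAGGGAC".split()
)
SEQ31_421_480 = "".join(
    "TGGCTGGATT ACCCATCCTC TTCTTGGAGG TGTCGCTGGG CCAGTTTGCC AGCCAGGGAC".split()
)

# Assumption: clone position 1 is nt 296 of the full-length sequence,
# so clone position 421 is full-length position 296 + 421 - 1 = 716.
FIRST_FULL_LENGTH_POSITION = 296 + 421 - 1

if __name__ == "__main__":
    print(mismatches(SEQ29_421_480, SEQ31_421_480, FIRST_FULL_LENGTH_POSITION))
```

Under that numbering assumption the single difference falls at position 733 (T in one clone, C in the other), the same position named in the claims' list of nucleotide substitutions.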

While this invention has been described with an emphasis upon preferred 
embodiments, it will be obvious to those of ordinary skill in the art that variations in the 
preferred devices and methods may be used and that it is intended that the invention may 
be practiced otherwise than as specifically described herein. Accordingly, this invention
includes all modifications encompassed within the spirit and scope of the invention as 
defined by the claims that follow. 
What is claimed:

1. A recombinant nucleic acid encoding a glycine transporter having at
least about 96% sequence identity with a reference sequence which is the protein
sequence of SEQ ID NO:27 or with a sequence corresponding to the protein sequence of
SEQ ID NO:27 except that it has one or more of the following amino acid substitutions
(1) Gly102 to Ser, (2) Ser124 to Phe, (3) Ile279 to Asn, (4) Arg393 to Gly, (5) Lys457 to
Asn, (6) Asp463 to Asn, (7) Cys610 to Tyr, (8) Ile611 to Val, (9) Phe733 to Ser, (10) Ile735
to Val, (11) Phe245 to Leu, (12) Val305 to Leu, (13) Thr366 to Ile or (14) Leu400 to Pro.

2. The nucleic acid of claim 1, wherein the reference sequence is the protein
sequence of SEQ ID NO:27 or with a sequence corresponding to the protein sequence of
SEQ ID NO:27 except that it has one or more of the following amino acid substitutions
(1) Gly102 to Ser, (2) Ser124 to Phe, (3) Ile279 to Asn, (4) Arg393 to Gly, (5) Lys457 to
Asn, (6) Asp463 to Asn, (7) Cys610 to Tyr, (8) Ile611 to Val, (9) Phe733 to Ser or (10)
Ile735 to Val.

3. The nucleic acid of claim 1, wherein said sequence identity is at least
about 97%.

4. The nucleic acid of claim 1, wherein said sequence identity is at least
about 98%.

5. The nucleic acid of claim 1, wherein the nucleic acid encodes a glycine
transporter having the reference sequence.

6. The nucleic acid of claim 1, comprising the nucleic acid sequence of
SEQ ID NO:26 or with a sequence that varies from the nucleic acid sequence of SEQ ID
NO:26 by having one or more of the following nucleotide substitutions (a) T6 to C, (b)
G304 to A, (c) C371 to T, (d) C571 to T, (e) T836 to A, (f) A1116 to G, (g) A1177 to G, (h)
G1371 to C, (i) G1387 to A, (j) G1829 to A, (k) A1831 to G, (l) G2103 to A, (m) T2198 to C,
(n) A2203 to G, (o) C342 to G, (p) C352 to T, (q) T733 to C, (r) A777 to G, (s) G913 to C, (t)
G951 to A, (u) C1097 to T or (v) T1199 to C.
7. The nucleic acid of claim 1, comprising the nucleic acid sequence of
SEQ ID NO:26 or with a sequence that varies from the nucleic acid sequence of SEQ ID
NO:26 by having one or more of the following nucleotide substitutions (a) T6 to C, (b)
G304 to A, (c) C371 to T, (d) C571 to T, (e) T836 to A, (f) A1116 to G, (g) A1177 to G, (h)
G1371 to C, (i) G1387 to A, (j) G1829 to A, (k) A1831 to G, (l) G2103 to A, (m) T2198 to C
or (n) A2203 to G.



8. A vector comprising the nucleic acid of claim 1 and an extrinsic promoter 
functionally associated therewith. 

9. A nucleic acid encoding a glycine transporter protein having at least
about 99.5% sequence identity with all or one to two contiguous portions of a reference
amino acid sequence, wherein the reference sequence is SEQ ID NO:27 or an amino acid
sequence corresponding to the amino acid sequence of SEQ ID NO:27 except that it has
one or more of the following amino acid substitutions (1) Gly102 to Ser, (2) Ser124 to Phe,
(3) Ile279 to Asn, (4) Arg393 to Gly, (5) Lys457 to Asn, (6) Asp463 to Asn, (7) Cys610 to
Tyr, (8) Ile611 to Val, (9) Phe733 to Ser, (10) Ile735 to Val, (11) Phe245 to Leu, (12) Val305
to Leu, (13) Thr366 to Ile or (14) Leu400 to Pro.

10. The nucleic acid of claim 9, wherein the one to two contiguous sequence
portions comprise at least about 600 amino acids.



11. A cell as follows:

(a) transformed with a first vector according to claim 8 and comprising said
nucleic acid, or

(b) transformed with a second vector and comprising a second nucleic acid
encoding a transporter protein having at least about 99.5% sequence identity with one to
two contiguous portions of a reference protein sequence which is SEQ ID NO:27 or a
sequence corresponding to the protein sequence of SEQ ID NO:27 except that it has one
or more of the following amino acid substitutions (1) Gly102 to Ser, (2) Ser124 to Phe, (3)
Ile279 to Asn, (4) Arg393 to Gly, (5) Lys457 to Asn, (6) Asp463 to Asn, (7) Cys610 to Tyr,
(8) Ile611 to Val, (9) Phe733 to Ser, (10) Ile735 to Val, (11) Phe245 to Leu, (12) Val305 to
Leu, (13) Thr366 to Ile or (14) Leu400 to Pro, wherein the encoded protein has glycine
transporter activity.

12. A method of producing a glycine transporter comprising growing the
cells of claim 11.

13. The method of claim 12 further comprising at least one of (a) isolating
membranes from said cells, which membranes comprise the glycine transporter or (b)
extracting a protein fraction from the cells which fraction comprises the glycine
transporter.

14. A glycine transporter isolated from a cell according to claim 11 and
expressed by said first or second extrinsically-derived nucleic acid.

15. A method for characterizing a bioactive agent for treatment of a nervous
system disorder or condition or for identifying bioactive agents for treatment of a nervous
system disorder or condition, the method comprising (a) providing a first assay
composition comprising (i) a cell according to claim 10 or (ii) an isolated glycine
transporter protein comprising the amino acid sequence encoded by said first or second
extrinsically-derived nucleic acids, (b) contacting the first assay composition with the
bioactive agent or a prospective bioactive agent, and measuring the amount of glycine
transport exhibited by the assay composition.

16. The method of claim 15, further comprising comparing the amount of
glycine transport exhibited by the first assay composition with the amount of glycine
transport exhibited by a second such assay composition that is treated the same as the
first assay composition except that it is not contacted with the bioactive agent or
prospective bioactive agent.

17. The method of claim 15, wherein the nervous system disorder or
condition is one of the group consisting of (a) pain, (b) spasticity, (c) myoclonus, (d)
muscle spasm, (e) muscle hyperactivity or (f) epilepsy.
18. The method of claim 17, wherein the spasticity is associated with stroke,
head trauma, neuronal cell death, multiple sclerosis, spinal cord injury, dystonia,
Huntington's disease or amyotrophic lateral sclerosis.

19. A nucleic acid that hybridizes with a reference nucleic acid sequence
which is SEQ ID NO:26 or with a sequence that varies from the nucleic acid sequence of
SEQ ID NO:26 by having one or more of the following nucleotide substitutions (a) T6 to
C, (b) G304 to A, (c) C371 to T, (d) C571 to T, (e) T836 to A, (f) A1116 to G, (g) A1177 to
G, (h) G1371 to C, (i) G1387 to A, (j) G1829 to A, (k) A1831 to G, (l) G2103 to A, (m) T2198
to C, (n) A2203 to G, (o) C342 to G, (p) C352 to T, (q) T733 to C, (r) A777 to G, (s) G913 to
C, (t) G951 to A, (u) C1097 to T or (v) T1199 to C, under conditions of sufficient stringency
to exclude hybridizations with (a) the sequence for a rat or mouse GlyT-2 transporter or
(b) the sequence for a mammalian GlyT-1 transporter.

20. The nucleic acid sequence of claim 19, wherein the nucleic acid is a PCR
primer and the stringent conditions are PCR conditions effective to amplify a human
GlyT-2 sequence but not to amplify (a) the sequence for a rat or mouse GlyT-2
transporter or (b) the sequence for a mammalian GlyT-1 transporter.



21. A nucleic acid of at least about eighteen nucleotides in length comprising 
a contiguous sequence from the coding or noncoding strand of a human GlyT-2 gene or 
cDNA, wherein the contiguous sequence has at least 1 nucleotide difference when 
compared with the rat GlyT-2 gene sequence that aligns with said contiguous sequence. 



22. An antisense molecule comprising a contiguous sequence from a coding 
or non-coding strand of a human gene or cDNA for GlyT-2 which is effective when 
administered to a cell, tissue, organ or animal to reduce the expression of GlyT-2 in the 
cell or in a cell of the tissue, organ or animal, wherein the contiguous sequence has at 
least 1 nucleotide difference when compared with the rat GlyT-2 gene sequence that 
aligns with said contiguous sequence. 

23. The antisense molecule of claim 22, wherein the contiguous stretch is 
included in the coding or non-coding strand of the nucleic acid sequence of SEQ ID 
NO:26 or of a sequence that varies from the nucleic acid sequence of SEQ ID NO:26 by 
having one or more of the following nucleotide substitutions (a) T 6 to C, (b) G 3<M to A, 
(c) C 371 to T, (d) C 571 to T, (e) T 836 to A, (f) A 1116 to G, (g) A 1177 to G, (h) G 1371 to C, 
(i) G 1387 to A, (j) G 1829 to A, (k) A 1831 to G, (l) G 2103 to A, (m) T 2198 to C, (n) A 2203 to 
G, (o) C 2 * 2 to G, (p) C 352 to T, (q) T 733 to C, (r) A 777 to G, (s) G 913 to C, (t) G 951 to A, 
(u) C 1097 to T or (v) T 1199 to C. 



24. An expression vector comprising the nucleic acid of claim 22. 

25. A method of reducing GlyT-2 expression in a tissue or cell comprising 
applying to the tissue or cell (a) a nucleic acid of claim 22 in an amount effective to 
reduce GlyT-2 expression or (b) an expression vector for expressing said nucleic acid in 
said tissue or cell in an amount effective to reduce GlyT-2 expression. 

26. A method of treating a nervous system disorder or condition comprising 
applying to a tissue or cell of a human patient a nervous system disorder or condition 
treating effective amount of a nucleic acid of claim 22 or a nervous system disorder or 
condition treating effective amount of an expression vector for expressing said nucleic 
acid in said tissue or cell. 



27. A method for detecting whether an animal has autoimmune antibodies 
against a glycine transporter, the method comprising contacting an antibody preparation 
from the animal or a body fluid from the animal with a polypeptide antigen comprising a 
glycine transporter or derived from the glycine transporter. 
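
The substitution shorthand used in claims 19 and 23 (for example, "T 6 to C") names a reference base, its position in SEQ ID NO:26, and the replacing base. As a minimal sketch of how such a list could be applied to a reference sequence, the Python below assumes that positions are 1-based. The function names and the ten-base toy sequence are hypothetical illustrations and are not taken from SEQ ID NO:26.

```python
import re

# Pattern for the claim shorthand, e.g. "T 6 to C" (hypothetical parser).
SUB_PATTERN = re.compile(r"^([ACGT])\s*(\d+)\s+to\s+([ACGT])$")


def parse_substitution(text):
    """Parse 'T 6 to C' into (reference_base, position, new_base)."""
    match = SUB_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized substitution: {text!r}")
    ref, pos, new = match.groups()
    return ref, int(pos), new


def apply_substitutions(reference, substitutions):
    """Return a copy of `reference` with every substitution applied.

    Assumption: positions are 1-based. Each substitution is checked
    against the reference base before it is applied.
    """
    bases = list(reference.upper())
    for text in substitutions:
        ref, pos, new = parse_substitution(text)
        if not 1 <= pos <= len(bases):
            raise ValueError(f"position {pos} is outside the sequence")
        if bases[pos - 1] != ref:
            raise ValueError(
                f"expected {ref} at position {pos}, found {bases[pos - 1]}"
            )
        bases[pos - 1] = new
    return "".join(bases)


# Toy reference for illustration only; NOT SEQ ID NO:26.
toy_reference = "ACGTCTAAGG"
print(apply_substitutions(toy_reference, ["T 6 to C", "G 9 to A"]))
# prints ACGTCCAAAG
```

Checking the reference base before each change guards against applying a list to the wrong sequence or with an off-by-one numbering.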

Alignment of human and rat GlyT-2 cDNA sequences 



Match % 89.0 

FIG. 3 (sheets 1 of 5 through 5 of 5): pairwise alignment of the human GlyT-2 cDNA (upper line of each block) with the rat GlyT-2 cDNA (lower line of each block), with colons marking matching positions. 
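
A pairwise alignment such as that of FIG. 3 supports two simple computations: an overall match percentage, and a search for human stretches of a given length that differ from the aligned rat stretch in at least one position, which is the criterion recited in claims 21 and 22. The Python below is a sketch under stated assumptions: both inputs are already aligned to equal length, "-" marks a gap, and gap columns are skipped when counting matches. How the match value in FIG. 3 was actually computed is not stated in the source, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of gap-free alignment columns whose bases match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gap columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def distinguishing_windows(aligned_human, aligned_rat, length=18, gap="-"):
    """Yield (start_column, human_window) for each gap-free human window
    of `length` columns that differs from the aligned rat stretch in at
    least one position. Start columns are 0-based alignment columns."""
    if len(aligned_human) != len(aligned_rat):
        raise ValueError("aligned sequences must have equal length")
    for start in range(len(aligned_human) - length + 1):
        human = aligned_human[start:start + length].upper()
        rat = aligned_rat[start:start + length].upper()
        if gap in human:
            continue  # window is not a contiguous human sequence
        if human != rat:
            yield start, human


# Toy aligned sequences for illustration only; not the GlyT-2 cDNAs.
human = "ACGTACGT"
rat = "ACGAACGT"
print(percent_identity(human, rat))
# prints 87.5
print(list(distinguishing_windows(human, rat, length=4)))
# prints [(0, 'ACGT'), (1, 'CGTA'), (2, 'GTAC'), (3, 'TACG')]
```

With the default length of 18, each window returned is a candidate stretch of the kind described in claim 21. Whether a given candidate discriminates between species under particular hybridization or PCR conditions is an experimental question that this sketch does not address.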

Alignment of human and rat GlyT-2 amino acid sequences 